Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Issue with Validation After DBFS to Volume Migration in Databricks Workspace

Sudheer2
New Contributor III

Hello Databricks Community,

I have successfully migrated my DBFS (Databricks File System) data from a source workspace to a target workspace, moving it from a path under Browse DBFS -> Folders to a Catalog -> Schema -> Volume.

Now, I want to validate the migration to ensure that everything was copied correctly. Specifically, I want to check:

  • Folder sizes
  • Folder names and subdirectories
  • Ensure that the folder sizes in the source and target workspaces match
  • Verify that folder names and structure in the source match those in the target workspace

I attempted some validation scripts, but they haven’t worked as expected. Could someone guide me on how I can perform these checks to ensure the migration was successful?

I appreciate your help in suggesting a better approach or providing a working solution.

Thank you!

3 REPLIES

Alberto_Umana
Databricks Employee

Hello @Sudheer2,

You can use the dbutils.fs.ls command to list the folder names and subdirectories in both the source and target workspaces.

For example:

def get_folder_structure(path):
    # List the immediate children of the given path
    folders = dbutils.fs.ls(path)
    folder_names = [folder.name for folder in folders]
    return folder_names

source_folders = get_folder_structure("dbfs:/source_folder_path")
target_folders = get_folder_structure("dbfs:/target_folder_path")

assert source_folders == target_folders, "Folder structures do not match"
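A top-level listing alone won't catch differences deeper in the tree or in file sizes. As a sketch (not an official validation tool), you could recursively walk both locations with dbutils.fs.ls and compare the results; the walk and compare_listings helpers and the placeholder paths below are illustrative, and dbutils.fs.ls also accepts /Volumes/... paths on Unity Catalog-enabled clusters:

```python
def compare_listings(source, target):
    """Compare two {relative_path: size_in_bytes} mappings.

    Returns (missing_in_target, extra_in_target, size_mismatches)."""
    missing = sorted(set(source) - set(target))
    extra = sorted(set(target) - set(source))
    mismatched = sorted(p for p in set(source) & set(target)
                        if source[p] != target[p])
    return missing, extra, mismatched

def walk(path, base=""):
    """Recursively build {relative_path: size} for all files under path,
    using dbutils.fs.ls (Databricks only; directory names end with '/')."""
    entries = {}
    for item in dbutils.fs.ls(path):
        rel = base + item.name
        if item.isDir():
            entries.update(walk(item.path, rel))
        else:
            entries[rel] = item.size
    return entries

# On a Databricks cluster (paths are illustrative placeholders):
# source = walk("dbfs:/source_folder_path")
# target = walk("/Volumes/my_catalog/my_schema/my_volume/")
# missing, extra, mismatched = compare_listings(source, target)
# print("Missing in target:", missing)
# print("Extra in target:", extra)
# print("Size mismatches:", mismatched)
```

Comparing relative paths and sizes this way covers all three checks at once: structure, names, and per-file sizes (folder sizes match if every file underneath matches).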

Sudheer2
New Contributor III

Hi @Alberto_Umana 


"Thanks for the suggestion! However, since I'm now working with volumes in Unity Catalog (after migrating from DBFS), I need to list folders and subfolders inside a volume located in catalogs and schemas in Unity Catalog, not just in DBFS.

Since Unity Catalog doesn't expose directories the same way as DBFS does, I would like to know if there's a way to list the contents of a volume (which now holds the migrated data) in a similar way to using dbutils.fs.ls() for DBFS folders. Could you guide me on how I can list the folder structure within volumes in Unity Catalog (catalog -> schema -> volume), especially after migration from Hive Metastore to Unity Catalog?

Thanks in advance for your help!"

Alberto_Umana
Databricks Employee

Hi @Sudheer2,

Thanks for your comments. You can try using the %sh magic command to list the folders and sub-directories with Unix-like commands.

For example:

(screenshot: Alberto_Umana_0-1736266471184.png — example of listing a volume's contents in a %sh cell)
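In case the screenshot doesn't render, here is a sketch of the kind of commands meant. Volumes are exposed on the driver's local filesystem under /Volumes/<catalog>/<schema>/<volume>, so standard Unix tools work in a %sh cell; the demo below runs the same commands against a temporary directory, since the volume path is a placeholder:

```shell
# In a %sh cell you would point these at /Volumes/<catalog>/<schema>/<volume>.
# Demo on a temporary directory with the same commands:
VOL=$(mktemp -d)
mkdir -p "$VOL/sub1" "$VOL/sub2"
printf 'data' > "$VOL/sub1/file.txt"

ls -R "$VOL"      # recursive listing of folders and files
du -sh "$VOL"/*   # size of each top-level entry, human-readable
```

Running `ls -R` and `du -sh` on both the source and the target and diffing the output is a quick way to spot structural or size differences.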
