
Query on DBFS migration

Harsh1
New Contributor II

We are doing a DBFS migration. The root DBFS of our legacy workspace has a folder 'user' holding 5.8 TB of data. We ran AWS CLI sync/cp from the legacy bucket to the target bucket, and then ran the same from the target bucket to the target DBFS.

Using this approach, we migrated the folders under /mnt and /dbfs-root to the target root bucket. While migrating /dbfs-root (user, FileStore, home), we ran into a problem: moving /dbfs/user is very slow.

/user - 5.8TB

/home - 680 GB

/FileStore - 181 GB 

Note: the transfer is slow only when migrating from the target S3 bucket to /dbfs/user.

Status Update on /dbfs/user till now:

Data Migration Status - 750 GB / 5.8 TB

Completion Rate ~12.9 %

Data transfer by AWS sync till now : ~403 GB

We are puzzled because this happens only for /user, which moves at around 200 GB a day. This was not the case for /home and /FileStore.

Please suggest best practices to mount the /user folder in the target workspace, given this volume of data.
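For scale, here is a rough back-of-the-envelope estimate based only on the figures quoted above. It assumes 1 TB = 1000 GB (which matches the ~12.9 % completion rate) and that the observed 200 GB/day rate stays constant; both are assumptions, not measurements. Under them, the remaining data would take roughly 25 more days.

```python
# Figures taken from the status update above; the constant rate is an assumption.
total_gb = 5.8 * 1000      # size of /user, taking 1 TB = 1000 GB
migrated_gb = 750          # data migrated so far
rate_gb_per_day = 200      # observed throughput for /user

remaining_gb = total_gb - migrated_gb
days_left = remaining_gb / rate_gb_per_day
completion_pct = 100 * migrated_gb / total_gb

print(f"completed: {completion_pct:.1f} %")
print(f"remaining: {remaining_gb:.0f} GB, about {days_left:.0f} days at the current rate")
```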

Methods already used:

  1. dbutils.fs.cp()
  2. aws s3 sync
  3. aws s3 cp
2 REPLIES

Hubert-Dudek
Esteemed Contributor III

dbutils.fs.cp() and other dbutils commands will be slow because they use only a single core.

Consider using AWS DataSync: shorturl.at/FNQTV
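One possible way to work around the single-core limitation mentioned above, if staying with dbutils, is to issue several copies concurrently from the driver, one per subfolder. The following is only a minimal sketch: `parallel_copy`, `copy_fn` and `max_workers` are hypothetical names introduced for illustration, the bucket path in the usage comment is a placeholder, and whether this actually speeds up the /user copy has not been verified in this thread.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List, Tuple


def parallel_copy(
    pairs: Iterable[Tuple[str, str]],
    copy_fn: Callable[[str, str], object],
    max_workers: int = 16,
) -> Tuple[List[str], Dict[str, Exception]]:
    """Run copy_fn(src, dst) for every (src, dst) pair on a thread pool.

    Returns the sources that copied successfully and a mapping of
    failed sources to the exception each one raised.
    """
    done: List[str] = []
    failed: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one copy task per pair and remember which source it belongs to.
        futures = {pool.submit(copy_fn, src, dst): src for src, dst in pairs}
        for fut in as_completed(futures):
            src = futures[fut]
            try:
                fut.result()
                done.append(src)
            except Exception as exc:  # keep going; failures are reported at the end
                failed[src] = exc
    return done, failed


# Hypothetical usage inside a Databricks notebook (paths are placeholders):
#
# subdirs = [f.path for f in dbutils.fs.ls("s3://<target-bucket>/user/")]
# pairs = [(p, "dbfs:/user/" + p.rstrip("/").split("/")[-1]) for p in subdirs]
# done, failed = parallel_copy(
#     pairs, lambda s, d: dbutils.fs.cp(s, d, recurse=True)
# )
```

Splitting by top-level subfolder keeps each task independent, and collecting failures instead of raising lets a long-running migration continue and retry only the folders that failed.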

Harsh1
New Contributor II

Thanks for the quick response.

Regarding the suggested AWS DataSync approach: we have tried DataSync in multiple ways, but it creates the folders in the S3 bucket itself, not on DBFS, whereas our task is to copy from the bucket to DBFS.

It seems to support only bucket-level operations, not DBFS-level ones.

Please suggest any best practices or approaches that could cater to our needs. That would be a great help. Thanks.
