Efficiently move multiple files with dbutils.fs.mv command on abfs storage

Dean_Lovelace
New Contributor III

As part of my batch processing I archive a large number of small files received from the source system each day using the dbutils.fs.mv command.

This takes hours as dbutils.fs.mv moves the files one at a time.

How can I speed this up?

Accepted Solutions

daniel_sahal
Honored Contributor III

@Dean Lovelace

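You can use multithreading. Each dbutils.fs.mv call spends most of its time waiting on storage, so you can issue many moves in parallel from the driver with a thread pool instead of moving the files one at a time.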

See example here: https://nealanalytics.com/blog/databricks-spark-jobs-optimization-techniques-multi-threading/
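Here is a minimal sketch of the idea using Python's standard ThreadPoolExecutor. It is not taken from the linked article. The helper name move_files_concurrently, the max_workers value of 16, and the example paths are my own assumptions for illustration. The move function is passed in as a parameter so the helper also runs outside a notebook; on Databricks you would pass dbutils.fs.mv.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import posixpath


def move_files_concurrently(src_paths, dest_dir, move_fn, max_workers=16):
    """Move each path in src_paths into dest_dir using a pool of threads.

    move_fn is any callable taking (source, destination); on Databricks
    this would be dbutils.fs.mv. Returns (moved, failed), where failed is
    a list of (source, exception) pairs so one bad file does not stop the rest.
    """
    moved, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for src in src_paths:
            # Keep the original file name under the destination directory.
            dst = posixpath.join(dest_dir, posixpath.basename(src.rstrip("/")))
            futures[pool.submit(move_fn, src, dst)] = src
        for fut in as_completed(futures):
            src = futures[fut]
            try:
                fut.result()  # re-raises any exception from move_fn
                moved.append(src)
            except Exception as exc:
                failed.append((src, exc))
    return moved, failed


# Hypothetical usage in a Databricks notebook (paths are placeholders):
# source_dir = "abfss://<container>@<account>.dfs.core.windows.net/incoming"
# archive_dir = "abfss://<container>@<account>.dfs.core.windows.net/archive"
# files = [f.path for f in dbutils.fs.ls(source_dir) if not f.isDir()]
# moved, failed = move_files_concurrently(files, archive_dir, dbutils.fs.mv)
```

The right number of workers depends on your driver size and storage throttling limits, so treat 16 as a starting point and tune it. Check the failed list after the run and retry or log those files.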
