
Parallel read of many delta tables

leobocci
New Contributor

I need to read many Delta tables in Azure object storage (block blobs). There is no root Delta table; instead there are many fragmented Delta tables that share a common schema but not a common path.

Iterating over the paths with a for loop performs very poorly because the list of paths is long and the operation can't be parallelized.

The final result should be a union of all Delta tables read from a list of paths. The problem is that I cannot pass a list of paths directly to spark.read.load, because this raises the exception:
Databricks Delta does not support multiple input paths in the load() API. To build a single DataFrame by loading multiple paths from the same Delta table, please load the root path of the Delta table with the corresponding partition filters. If the multiple paths are from different Delta tables, please use Dataset's union()/unionByName() APIs to combine the DataFrames generated by separate load() API calls.

1 REPLY

raphaelblg
Honored Contributor II

Hello @leobocci ,

Reading multiple Delta tables requires one read operation per table. You can run those read operations concurrently through Job Workflows, DLT, the Databricks CLI, DBSQL, interactive clusters, and other resources.
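On a single interactive cluster, one way to follow the advice in the error message (separate load() calls combined with unionByName()) is to issue the load() calls from a driver-side thread pool instead of a sequential for loop. The following is only a minimal sketch, assuming the time is spent in the per-table load() calls and that an active `spark` session is available. The helper names `read_and_union` and `read_delta_tables`, and the default of 16 worker threads, are hypothetical and not part of any Databricks API.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def read_and_union(paths, load_one, union_two, max_workers=16):
    """Load every path concurrently, then fold the results into one object.

    load_one(path) returns a frame for one path; union_two(a, b) combines two frames.
    """
    if not paths:
        raise ValueError("paths must not be empty")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input paths.
        frames = list(pool.map(load_one, paths))
    # Combine pairwise, left to right, into a single result.
    return reduce(union_two, frames)


def read_delta_tables(spark, paths, max_workers=16):
    """Hypothetical wrapper: one Delta load() per path, combined with unionByName()."""
    return read_and_union(
        paths,
        load_one=lambda path: spark.read.format("delta").load(path),
        union_two=lambda left, right: left.unionByName(right),
        max_workers=max_workers,
    )
```

Usage would look like `df = read_delta_tables(spark, list_of_paths)`. The threads only overlap the driver-side work of opening each table; the actual data scan still runs as a normal distributed Spark job when an action is called on `df`.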

If the bottleneck is listing the table paths, I'm afraid there is nothing we can do to speed up filesystem read/list operations, as these are not fully managed by Databricks.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks
