Data Engineering

Delta sharing speed

turtleXturtle
New Contributor II

Hi, I am comparing the performance of Delta Shared tables with local tables, and querying the shared table is about 10x slower than querying locally.

Scenario:

I am using a 2XS serverless SQL warehouse and a table with 15M rows and 10 columns. The query is:

select date, count(*) as num_rows, sum(spend) as total_spend
from catalog.schema.table
group by date
order by 1

I have two accounts for testing, one on AWS us-east-1 and one on AWS us-west-2. The share is backed by an R2 bucket in ENAM.

Test:

If I run on the normal delta table in account 1, this returns in 1 second.

If I deep clone into an R2 bucket and then query the deep cloned table, that also returns in 1 second.

If I delta share the R2 table to account 2, and then query there, that returns in 10 seconds.

If I create a copy of the shared table in account 2, that returns in 1 second.
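
For anyone trying to reproduce this, the four steps above could be expressed roughly as the Databricks SQL below. This is only a sketch under assumptions: every object name (main.perf_test.spend, perf_test_share, account_2_recipient, account_1_provider, shared_catalog) is a hypothetical placeholder, the R2 path and sharing identifier are left as unfilled placeholders, and the exact statements used in the original test may have differed.

```sql
-- All object names are hypothetical placeholders, not the names used in the test.

-- Account 1 (provider): deep clone the source table into the R2 bucket.
-- Assumes a storage credential and external location for the bucket already exist.
CREATE TABLE main.perf_test.spend_r2
  DEEP CLONE main.perf_test.spend
  LOCATION 'r2://<bucket>@<account-id>.r2.cloudflarestorage.com/<path>';

-- Account 1 (provider): share the R2-backed table with account 2.
CREATE SHARE IF NOT EXISTS perf_test_share;
ALTER SHARE perf_test_share ADD TABLE main.perf_test.spend_r2;
CREATE RECIPIENT IF NOT EXISTS account_2_recipient USING ID '<sharing-identifier>';
GRANT SELECT ON SHARE perf_test_share TO RECIPIENT account_2_recipient;

-- Account 2 (recipient): mount the share as a catalog and query it directly.
CREATE CATALOG IF NOT EXISTS shared_catalog
  USING SHARE account_1_provider.perf_test_share;

SELECT date, count(*) AS num_rows, sum(spend) AS total_spend
FROM shared_catalog.perf_test.spend_r2
GROUP BY date
ORDER BY 1;

-- Account 2 (recipient): materialize a local copy for comparison.
CREATE TABLE main.perf_test.spend_copy AS
SELECT * FROM shared_catalog.perf_test.spend_r2;
```

Running the same aggregate against each of the four tables on the same warehouse size keeps the comparison like for like.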

Question

Is this speed difference expected? Am I doing something wrong, or is the best practice to copy Delta Shared tables to local storage (defeating a big benefit of Delta Sharing)?
