DELTA Merge taking too much Time

FAHADURREHMAN
New Contributor III

Hi Legends, 

I have a timeseries DELTA table having 707.1GiB, 7702 files, 262 Billion rows. (Mainly its timeseries data). This table is clustered on 2 columns (Timestamp col & 2nd one is descriptive column)

I have designed a pipeline which runs every week and MERGEs source cdc data into above DELTA table.

However it takes ages to perform MERGE operation. For example i executed pipeline last night and its almost 16 Hours have passed and yet its not completed. M i doing anything wrong here?? And any suggestion how to make DELTA MERGE faster? 😞

 

Cheers

 

 

anuj_lathi
Databricks Employee
Databricks Employee

Great question -- slow MERGE is one of the most common Delta Lake performance issues. Here's a systematic checklist:

1. Partition Pruning in the MERGE Condition

The #1 cause of slow MERGEs is missing the partition column in your ON clause. If your target table is partitioned by, say, date, your merge condition must include it:

MERGE INTO target t

USING source s

ON t.date = s.date AND t.id = s.id   -- includes partition column

WHEN MATCHED THEN UPDATE SET ...

WHEN NOT MATCHED THEN INSERT ...

 

Without it, Spark scans every partition looking for matches -- even if the source data only touches a tiny slice.

2. Enable Low Shuffle Merge

Low Shuffle Merge (LSM) dramatically reduces the amount of data rewritten by only touching files that actually have changes:

SET spark.databricks.delta.merge.lowShuffle.enabled = true;

 

This avoids the classic problem where MERGE rewrites files even when no rows in them changed.

3. Z-ORDER on Join Columns

If your table isn't partitioned (or the merge key isn't the partition column), Z-ORDER on the columns used in your ON clause:

OPTIMIZE target_table ZORDER BY (id);

 

This co-locates related data in fewer files, so the merge only reads/rewrites a small number of files.

4. Right-size Your Source Data

If the source (incoming) data is small relative to the target, broadcast it:

from pyspark.sql.functions import broadcast

spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "100m")

 

If the source is large, consider breaking it into micro-batches by partition.

5. File Compaction

Too many small files in the target table forces the merge to open/scan thousands of files:

OPTIMIZE target_table;

 

Run this periodically. Check file count with:

DESCRIBE DETAIL target_table;

 

6. Liquid Clustering (Recommended for New Tables)

If you're on a recent DBR, consider migrating to Liquid Clustering instead of traditional partitioning + Z-ORDER. It automatically handles file layout:

ALTER TABLE target_table CLUSTER BY (id, date);

 

7. Check the Spark UI

Look at the SQL tab in the Spark UI for your MERGE job. Key things to check:

  • Number of files scanned vs. number of files rewritten -- if these are close to the total file count, you're not pruning
  • Shuffle read/write sizes -- large shuffles indicate the join is scanning too broadly
  • Skew in task durations -- one slow task can bottleneck everything

Quick Diagnostic Query

DESCRIBE HISTORY target_table LIMIT 5;

 

Look at operationMetrics -- it shows numTargetFilesAdded, numTargetFilesRemoved, numTargetRowsUpdated, etc. If files removed is approximately equal to total files, you're rewriting the whole table on every merge.

References

Anuj Lathi
Solutions Engineer @ Databricks

View solution in original post

Thsnk @anuj_lathi , Should I enable Partitioning and Z-Ordering and move away from Clustering for the target table?