Hey everyone,
We’ve all been there: a Delta Lake MERGE job that should take 20 minutes drags on for 90 minutes, while a full overwrite of the same table finishes in under 20. When an overwrite outpaces a selective merge, it's a massive red flag that your pipeline is doing too much heavy lifting under the hood. This usually happens because the engine is scanning unnecessary partitions and opening thousands of small files.
To fix this, you must explicitly force partition pruning by calculating your source date bounds upfront and passing them as deterministic literals directly into your MERGE condition. Joining on target.date = source.date isn't enough for the optimizer; adding hardcoded ranges dropped our target scan from 580 partitions to just 31.
Additionally, if your high-cardinality merge keys are scattered randomly across partitions, a single update forces Spark to rewrite hundreds of files. You can combat this I/O overhead by running a targeted OPTIMIZE with ZORDER strictly scoped to your active ingestion window.
If your SLAs are slipping, immediately check your numTargetFilesScanned metric via table history and look for small average file sizes using DESCRIBE DETAIL. I published a full architectural deep dive on how to read these Spark UI metrics and why this occurs over on Medium: Databricks MERGE Was 5x Slower Than an Overwrite — The Hidden Mistake That Was Killing Our SLA.
What strategies are you all using to keep your Delta merges selective as your datasets scale past hundreds of millions of rows? Let's discuss below!