Merge rewrites many unmodified files
03-04-2025 04:02 AM
Hello. I want to run a merge on a subset of my Delta table partitions to do incremental upserts and keep two tables in sync. I do not use a whenNotMatchedBySource clause to clean up stale rows in my target because of this GitHub Issue.
Because of that, I am focusing only on update and insert in this first step, and I am facing an issue that I need help with. As a prerequisite for my merge, I extract the most recent and relevant partitions (the last X partitions) of the dataframe that I want to sync my Delta table with, and create a Spark dataframe called "updates" that contains only those relevant partitions (to "prune" the source). Side note: the Delta table and the Spark dataframe have equivalent schemas.
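For reference, a minimal sketch of how I build the pruned source dataframe and the partition list (df_source, the PARTITION column name, and the number of kept partitions are placeholders, not my actual job):

```python
from pyspark.sql import functions as F

# Assumed setup: partition column is called "PARTITION", keep the last 3 partitions.
NUM_RECENT_PARTITIONS = 3

# Collect the most recent partition values from the full source dataframe.
recent_partitions = [
    row["PARTITION"]
    for row in (
        df_source.select("PARTITION")
        .distinct()
        .orderBy(F.col("PARTITION").desc())
        .limit(NUM_RECENT_PARTITIONS)
        .collect()
    )
]

# Pruned source ("updates") containing only the relevant partitions.
df_subset_relevant = df_source.filter(F.col("PARTITION").isin(recent_partitions))

# Literal list used later in the merge condition to prune the target side.
partition_list = ", ".join(f"'{p}'" for p in recent_partitions)
```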
My merge statement looks like this:
deltaTable_target.alias("target").merge(
    df_subset_relevant.alias("source"),
    f"target.PARTITION IN ({partition_list}) AND target.hashkey = source.HASHKEY"
).whenNotMatchedInsertAll().whenMatchedUpdateAll(
    condition="target.col_that_can_change <> source.col_that_can_change"
).execute()
The merge statement itself works correctly according to the Delta table's log files (everything is inserted and updated as expected). However, the merge also marks files as touched, and rewrites them, even when they contain no changes at all. To be specific, I know that one of the partitions is exactly the same in source and target, and it still gets rewritten completely.
My assumption is that this is forced by the merge condition I use to partition-prune the target table; at least I cannot think of any other reason for it.
In an ideal world, the numTargetRowsCopied value in the logs should reflect only the partitions that actually have changes, not all partitions that are part of the merge, am I correct?
Maybe I simply do not understand the inner workings correctly and this is a necessary step. Any insight into this problem is highly appreciated.
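For reference, this is roughly how I read the merge metrics from the table history; a minimal sketch assuming the target is registered as my_schema.my_target_table (the name is just a placeholder):

```python
from delta.tables import DeltaTable

# Placeholder table name; in my job this points at the actual target table.
deltaTable_target = DeltaTable.forName(spark, "my_schema.my_target_table")

# The most recent entry in the table history is the MERGE I just ran.
last_op = (
    deltaTable_target.history(1)
    .select("operation", "operationMetrics")
    .collect()[0]
)

metrics = last_op["operationMetrics"]
print("operation:            ", last_op["operation"])
print("numTargetRowsCopied:  ", metrics.get("numTargetRowsCopied"))
print("numTargetRowsUpdated: ", metrics.get("numTargetRowsUpdated"))
print("numTargetRowsInserted:", metrics.get("numTargetRowsInserted"))
print("numTargetFilesAdded:  ", metrics.get("numTargetFilesAdded"))
print("numTargetFilesRemoved:", metrics.get("numTargetFilesRemoved"))
```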
- Labels: Delta Lake
03-10-2025 09:09 PM
Hi MuesLee,
How are you doing today? As per my understanding, yes, your assumption is mostly correct. The reason even unchanged partitions get rewritten is how Delta Lake's merge operation handles partition pruning and updates. Even though your merge condition filters partitions with target.PARTITION IN ({partition_list}), Delta Lake still rewrites the data files of a matching partition whenever any row inside them needs to be updated. This is because Delta operates at the file level, not the row level: if even one row in a data file is touched by the merge, the whole file is rewritten, and the unchanged rows in it are copied over (which is what shows up as numTargetRowsCopied).
To minimize this, you could try Z-Ordering the target on hashkey to improve the file layout and reduce the number of files a merge has to rewrite. Another approach is to use Optimized Writes and, if available, Change Data Feed (CDF) to track and apply only the changed rows more efficiently. While Delta's merge is not as fine-grained as a traditional row-based database, understanding that it rewrites whole data files whenever they contain matched rows can help you design strategies that avoid unnecessary rewrites, as sketched below.
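To illustrate, here is a rough sketch of what I mean (the table name is a placeholder, and the availability of OPTIMIZE/Z-Ordering, CDF, and optimized writes depends on your Delta Lake/DBR version):

```python
from delta.tables import DeltaTable

# Z-Order the target on the merge key so matching rows are clustered in fewer files,
# which can reduce how many files a merge has to rewrite.
deltaTable_target = DeltaTable.forName(spark, "my_schema.my_target_table")
deltaTable_target.optimize().executeZOrderBy("hashkey")

# Optionally enable Change Data Feed on the target so downstream syncs can read
# only the changed rows instead of re-scanning whole partitions.
spark.sql("""
    ALTER TABLE my_schema.my_target_table
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

# Optimized writes can also be enabled as a table property.
spark.sql("""
    ALTER TABLE my_schema.my_target_table
    SET TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true)
""")
```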
Regards,
Brahma

