Delta table - version number change on merge

SKC01
New Contributor II

I am running a merge with pyspark on a delta table in which nothing is getting updated in the target table. Still target table version is incremented when I check the table history. Is that expected behavior?

Accepted Solutions

Kaniz
Community Manager

Hi @SKC01 ,

Yes, this is the expected behaviour. Even if a merge updates no rows in the target Delta table, the table version is still incremented. Delta Lake treats a merge as a transaction that can potentially modify the table, and every committed transaction, whether or not it changes any data, is recorded as a new version of the table.

In your case, the merge upserts data from a source table into the target Delta table. Even if no rows match the condition and nothing is updated, the operation is still committed as a transaction and increments the version of the target table.
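One way to confirm this from the table history is to look at the `operationMetrics` recorded for each MERGE entry. The snippet below is only a minimal sketch: it assumes the history rows have already been collected into plain dicts (for example with `DeltaTable.forPath(spark, path).history().collect()` and `row.asDict()`), and the helper name `noop_merge_versions` and the sample rows are hypothetical, made up for illustration.

```python
# Row-level metric keys that Delta Lake records for MERGE commits.
# In the history output these values are strings, so they are cast to int.
CHANGE_METRICS = (
    "numTargetRowsInserted",
    "numTargetRowsUpdated",
    "numTargetRowsDeleted",
)


def noop_merge_versions(history_rows):
    """Return the versions of MERGE commits whose metrics report no row changes.

    Assumption: a metric that is missing from a row is treated as zero.
    """
    versions = []
    for row in history_rows:
        if row.get("operation") != "MERGE":
            continue
        metrics = row.get("operationMetrics") or {}
        if all(int(metrics.get(key, 0)) == 0 for key in CHANGE_METRICS):
            versions.append(row["version"])
    return sorted(versions)


# Hypothetical history rows, shaped like the output of the table history.
sample_history = [
    {"version": 2, "operation": "MERGE", "operationMetrics": {
        "numTargetRowsInserted": "0",
        "numTargetRowsUpdated": "0",
        "numTargetRowsDeleted": "0"}},
    {"version": 1, "operation": "MERGE", "operationMetrics": {
        "numTargetRowsInserted": "3",
        "numTargetRowsUpdated": "1",
        "numTargetRowsDeleted": "0"}},
    {"version": 0, "operation": "WRITE", "operationMetrics": {"numFiles": "1"}},
]

print(noop_merge_versions(sample_history))  # [2]
```

Here version 2 exists in the history even though its merge inserted, updated, and deleted nothing.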


Sidhant07
New Contributor III

Yes, this is the expected behavior. In Delta Lake, every operation, including MERGE, is atomic. This means that each operation is a transaction that can either succeed completely or fail; it cannot have partial success. Even if the MERGE operation doesn't result in any changes to the target table, it is still considered a transaction and thus increments the table version.

When a MERGE operation is performed, Delta Lake performs several steps:

1. It identifies the rows in the source data that match the condition specified in the MERGE statement.
2. It applies the update, delete, or insert actions to the matched and not-matched source and target rows.
3. It writes out the result as a new version of the target Delta table.

Even if no rows are updated, deleted, or inserted, these steps are still performed, and the result is written out as a new version of the target Delta table. This is why the table version is incremented even if nothing is updated in the target table.
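The three steps above can be illustrated with a toy, in-memory model. This is not how Delta Lake is implemented; `ToyTable` and its methods are hypothetical names, and the sketch only shows why a commit that runs unconditionally bumps the version even when zero rows change.

```python
class ToyTable:
    """Toy stand-in for a versioned table; not Delta Lake's real implementation."""

    def __init__(self, rows):
        self.rows = dict(rows)  # key -> value
        # The log starts with one commit, so the first version is 0.
        self.log = [{"operation": "CREATE", "rows_changed": len(self.rows)}]

    @property
    def version(self):
        # The version is the index of the latest commit in the log.
        return len(self.log) - 1

    def merge(self, source):
        # Step 1: split source rows into matched and not-matched by key.
        matched = {k: v for k, v in source.items() if k in self.rows}
        not_matched = {k: v for k, v in source.items() if k not in self.rows}

        # Step 2: apply the actions (update matched rows, insert the rest).
        self.rows.update(matched)
        self.rows.update(not_matched)
        changed = len(matched) + len(not_matched)

        # Step 3: commit unconditionally, even when nothing changed.
        self.log.append({"operation": "MERGE", "rows_changed": changed})
        return changed


table = ToyTable({1: "a", 2: "b"})
changed = table.merge({})       # empty source: nothing to update or insert
print(table.version, changed)   # 1 0
```

The merge with an empty source changes no rows, yet the version moves from 0 to 1 because the commit in step 3 does not depend on the change count.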
