MERGE, UPDATE, and DELETE using deletion vectors
09-19-2025 11:17 AM
Hi,
When deletion vectors are enabled on a Delta table, is there a guarantee that MERGE, UPDATE, or DELETE operations will not rewrite unmodified data, but rather use deletion vectors to soft delete the original file?
For example, suppose the table currently consists of a single file, a.parquet. If I update a subset of rows in that file using the UPDATE operation, can I rely on the transaction log to record the following sequence of actions?
1. Remove a.parquet
2. Add a.parquet with a deletion vector indicating the removed rows
3. Add a new file (e.g. new.parquet) containing the updated rows
If there’s no such guarantee, could you explain scenarios where deletion vectors might be enabled but still not applied?
Thanks!
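One way to check the sequence above against a real commit is to read the actions in a `_delta_log/*.json` file directly. The following is a minimal sketch, assuming each line of the commit file is a JSON object keyed by action type (`add`, `remove`, `commitInfo`) and that an `add` action carries a `deletionVector` field when a DV is attached, as the Delta protocol describes. The sample commit, its paths, and the DV descriptor values are hypothetical.

```python
import json


def summarize_commit(commit_lines):
    """Split one commit's file actions into removed, added-with-DV, and added-plain paths."""
    removed, added_with_dv, added_plain = [], [], []
    for line in commit_lines:
        action = json.loads(line)
        if "remove" in action:
            removed.append(action["remove"]["path"])
        elif "add" in action:
            add = action["add"]
            if add.get("deletionVector"):
                added_with_dv.append(add["path"])
            else:
                added_plain.append(add["path"])
    return removed, added_with_dv, added_plain


def matches_dv_update_pattern(commit_lines):
    """True if every removed file is re-added with a DV and at least one new file is added."""
    removed, added_with_dv, added_plain = summarize_commit(commit_lines)
    return bool(removed) and set(removed) <= set(added_with_dv) and bool(added_plain)


# Hypothetical commit contents, one JSON action per line as in a _delta_log file.
sample_commit = [
    json.dumps({"commitInfo": {"operation": "UPDATE"}}),
    json.dumps({"remove": {"path": "a.parquet", "dataChange": True}}),
    json.dumps({"add": {"path": "a.parquet", "dataChange": True,
                        "deletionVector": {"storageType": "u",
                                           "pathOrInlineDv": "hypothetical-id",
                                           "cardinality": 2}}}),
    json.dumps({"add": {"path": "new.parquet", "dataChange": True}}),
]

print(matches_dv_update_pattern(sample_commit))
```

In practice the lines would come from reading the commit file for the version in question, e.g. `open(path).read().splitlines()`.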
09-19-2025 08:43 PM
Hey @shanisolomonron,
Yes, you are right. The above sequence of actions is always true for MERGE and UPDATE. For DELETE, you won't see the "Add a new file" action (step 3).
And yes, if the table has the DV feature enabled, the writer/runtime supports DVs for that specific command, and you’re not in a special mode (append-only, UniForm, certain HMS cases), then your expected remove → add(with DV) → add(new files) pattern is exactly what happens.
Please let me know if you have any further questions.
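To see which of the table-level conditions mentioned above apply, the table properties can be read from the `metaData` action in the transaction log. This is a rough sketch, assuming the properties `delta.enableDeletionVectors`, `delta.appendOnly`, and `delta.universalFormat.enabledFormats` are the relevant ones. It only reports what is set. It does not verify whether a given runtime actually skips DVs because of them, and it does not cover the HMS cases. The sample log line is hypothetical.

```python
import json

# Properties to look for; whether each one really prevents DVs from being
# written depends on the runtime and is an assumption, not checked here.
WATCHED_PROPERTIES = (
    "delta.enableDeletionVectors",
    "delta.appendOnly",
    "delta.universalFormat.enabledFormats",
)


def watched_properties(log_lines):
    """Return the watched properties from the last metaData action seen in log_lines."""
    config = {}
    for line in log_lines:
        action = json.loads(line)
        if "metaData" in action:
            config = action["metaData"].get("configuration", {})
    return {key: config[key] for key in WATCHED_PROPERTIES if key in config}


# Hypothetical metaData action, as it might appear in a _delta_log commit file.
sample_log = [
    json.dumps({"metaData": {"id": "hypothetical-table-id",
                             "configuration": {
                                 "delta.enableDeletionVectors": "true",
                                 "delta.appendOnly": "false"}}}),
]

print(watched_properties(sample_log))
```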
09-22-2025 08:05 AM
Thanks @K_Anudeep.
Could you clarify the conditions under which deletion vectors are enabled but I would not see the above sequence of actions, and would instead see:
Remove a.parquet
Add a new file (e.g. new.parquet) containing the non-deleted + updated rows
09-22-2025 08:45 AM
Hey @shanisolomonron, that's classic copy-on-write, which is what happens when DVs are disabled. When DVs are enabled and a Parquet file is removed, the same Parquet file is added back with a DV.
In the above scenario, if a.parquet already has a DV associated with it and gets a REMOVE action, and a new file new.parquet is then added, it is an OPTIMIZE command.
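The distinction above can be checked per commit. Below is a rough sketch, assuming the same one-JSON-action-per-line log layout, that a `remove` action carries a `deletionVector` field when the removed file had a DV, and that `commitInfo` includes an `operation` field (its contents are writer-defined, so treat it as a hint). The category labels and the sample commit are hypothetical.

```python
import json


def classify_commit(commit_lines):
    """Return (operation, pattern) for one commit, where pattern is a rough label."""
    operation = None
    removed = {}            # path -> whether the removed file carried a DV
    readded_with_dv = set()  # paths added in this commit with a DV attached
    for line in commit_lines:
        action = json.loads(line)
        if "commitInfo" in action:
            operation = action["commitInfo"].get("operation")
        elif "remove" in action:
            rem = action["remove"]
            removed[rem["path"]] = bool(rem.get("deletionVector"))
        elif "add" in action:
            add = action["add"]
            if add.get("deletionVector"):
                readded_with_dv.add(add["path"])

    if not removed:
        pattern = "no files removed"
    elif set(removed) <= readded_with_dv:
        # Same file removed and added back with a DV: the soft-delete path.
        pattern = "soft delete via deletion vector"
    elif any(removed.values()):
        # A file that already had a DV was dropped and not added back.
        pattern = "rewrite of a file that carried a deletion vector"
    else:
        pattern = "copy-on-write rewrite"
    return operation, pattern


# Hypothetical commit matching the remove + add(new file) sequence in question.
copy_on_write_commit = [
    json.dumps({"commitInfo": {"operation": "UPDATE"}}),
    json.dumps({"remove": {"path": "a.parquet", "dataChange": True}}),
    json.dumps({"add": {"path": "new.parquet", "dataChange": True}}),
]

print(classify_commit(copy_on_write_commit))
```

Comparing the returned `operation` with the pattern is one way to tell a data-changing rewrite from a compaction such as OPTIMIZE.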
09-22-2025 09:37 AM
Thanks for the clarification @K_Anudeep.
Sorry to circle back on this, but in your original answer you mentioned there are scenarios where a Delta table can have deletion vectors enabled yet not actually use them—for example, when the table is append-only, uses UniForm, or in certain HMS situations.
Could you clarify the specific conditions where a Delta table has deletion vectors enabled, the engine is capable of writing them, but they still aren’t applied?