Data Engineering

Merge, Update and deletes using deletion vectors

shanisolomonron
New Contributor

Hi,

When deletion vectors are enabled on a Delta table, is there a guarantee that MERGE, UPDATE, or DELETE operations will not rewrite unmodified data, but will instead use deletion vectors to soft-delete the affected rows in the original file?

For example, suppose the table currently consists of a single file, a.parquet. If I update a subset of rows in that file using the UPDATE operation, can I rely on the transaction log to record the following sequence of actions?

  1. Remove a.parquet

  2. Add a.parquet with a deletion vector indicating the removed rows

  3. Add a new file (e.g. new.parquet) containing the updated rows

If there’s no such guarantee, could you explain scenarios where deletion vectors might be enabled but still not applied?

Thanks!

1 REPLY

K_Anudeep
Databricks Employee

Hey @shanisolomonron ,

Yes, you are right. When deletion vectors are applied, the above sequence of actions holds for MERGE and UPDATE. For DELETE, you won't see step 3 (no new file is added), because no rows are rewritten.
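If you want to check this on your own table, one option is to read the commit file for the UPDATE from the table's _delta_log directory and list its file actions. The snippet below is only a minimal sketch, not an official tool: summarize_commit is a hypothetical helper, and sample_commit is made-up content shaped like a commit for the a.parquet example in the question (real commits carry more fields, and the order of actions inside a commit should not be relied on).

```python
import json

def summarize_commit(commit_text):
    """List (action, path, has_deletion_vector) for each file action in one
    Delta commit, given the commit file's newline-delimited JSON text."""
    summary = []
    for line in commit_text.splitlines():
        line = line.strip()
        if not line:
            continue
        action = json.loads(line)
        for kind in ("remove", "add"):
            if kind in action:
                body = action[kind]
                # A DV-backed action carries a "deletionVector" descriptor.
                has_dv = body.get("deletionVector") is not None
                summary.append((kind, body["path"], has_dv))
    return summary

# Hypothetical commit content for the UPDATE described in the question.
sample_commit = "\n".join([
    json.dumps({"commitInfo": {"operation": "UPDATE"}}),
    json.dumps({"remove": {"path": "a.parquet", "dataChange": True}}),
    json.dumps({"add": {"path": "a.parquet", "dataChange": True,
                        "deletionVector": {"storageType": "u", "cardinality": 2}}}),
    json.dumps({"add": {"path": "new.parquet", "dataChange": True}}),
])

for kind, path, has_dv in summarize_commit(sample_commit):
    print(kind, path, "with DV" if has_dv else "no DV")
```

If the UPDATE had rewritten the file instead of using a deletion vector, you would expect to see the remove of a.parquet and an add of a new file, with no add of a.parquet carrying a deletionVector entry.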

As for the guarantee: the pattern applies only when all of the following hold: the table has the DV feature enabled, the writer/runtime supports DVs for that specific command, and you're not in a special mode (append-only, UniForm, certain HMS cases). In that case, your expected remove → add (with DV) → add (new files) pattern is exactly what happens.
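The first of these conditions can be checked from the transaction log itself. The sketch below is an assumption-laden illustration rather than a complete check: dv_feature_enabled is a hypothetical helper, it looks only at the delta.enableDeletionVectors table property and the deletionVectors entry in the protocol's writerFeatures, and it says nothing about whether your runtime supports DVs for a given command or whether one of the special modes applies.

```python
import json

def dv_feature_enabled(log_text):
    """Return True if the given log actions (newline-delimited JSON) show the
    deletion vectors table property set and the writer feature listed."""
    enabled_by_property = False
    listed_in_writer_features = False
    for line in log_text.splitlines():
        if not line.strip():
            continue
        action = json.loads(line)
        if "metaData" in action:
            # Table properties are stored as strings in "configuration".
            config = action["metaData"].get("configuration") or {}
            value = config.get("delta.enableDeletionVectors", "false")
            enabled_by_property = value.lower() == "true"
        if "protocol" in action:
            features = action["protocol"].get("writerFeatures") or []
            listed_in_writer_features = "deletionVectors" in features
    return enabled_by_property and listed_in_writer_features

# Hypothetical log content for a table with deletion vectors enabled.
sample_log = "\n".join([
    json.dumps({"protocol": {"minReaderVersion": 3, "minWriterVersion": 7,
                             "readerFeatures": ["deletionVectors"],
                             "writerFeatures": ["deletionVectors"]}}),
    json.dumps({"metaData": {"id": "example-table-id",
                             "configuration": {"delta.enableDeletionVectors": "true"}}}),
])

print(dv_feature_enabled(sample_log))
```

Pass it the actions that carry the table's latest protocol and metaData (for example, the commit that created the table or last changed its properties). A False result means DVs will not be used; a True result covers only the first condition above.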

Please let me know if you have any further questions.
