- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-16-2021 09:38 AM
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-18-2021 01:36 PM
Delta implements MERGE by physically rewriting existing files.
It is implemented in two steps.
- Perform an inner join between the target table and source table to select all files that have matches.
- Perform an outer join between the selected files in the target and source tables and write out the updated/deleted/inserted data.
Here is an article that explain the DML internals of delta - https://databricks.com/blog/2020/09/29/diving-into-delta-lake-dml-internals-update-delete-merge.html
The old files that are not referenced anymore would get garbage collected eventually when you run VACUUM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-18-2021 01:36 PM
Delta implements MERGE by physically rewriting existing files.
It is implemented in two steps.
- Perform an inner join between the target table and source table to select all files that have matches.
- Perform an outer join between the selected files in the target and source tables and write out the updated/deleted/inserted data.
Here is an article that explain the DML internals of delta - https://databricks.com/blog/2020/09/29/diving-into-delta-lake-dml-internals-update-delete-merge.html
The old files that are not referenced anymore would get garbage collected eventually when you run VACUUM

