12-05-2025 09:44 AM
Row tracking gives each Delta row a stable internal ID, so Delta can track inserts, updates, and deletes across table versions, even when files are rewritten or compacted.
Suppose we have a Delta table:
| 1 | A |
| 2 | B |
When row tracking is enabled, Delta Lake stores an internal row ID for each row (hidden from ordinary query results, but readable through the `_metadata.row_id` metadata field):
| 1 | A | 001 |
| 2 | B | 002 |
Think of row_id as a stable fingerprint for that row.
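To make the "stable fingerprint" idea concrete, here is a minimal toy model in plain Python. It is not Delta Lake code: `ToyTable`, `Row`, and the list-of-files layout are hypothetical names invented for illustration. It only shows the property described above, that an ID assigned at insert time survives a file rewrite.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Row:
    row_id: int    # stable internal ID, assigned once at insert time
    values: tuple  # the user-visible columns


class ToyTable:
    """Hypothetical toy model: rows live in 'files', IDs do not depend on file layout."""

    def __init__(self):
        self._next_id = count(1)
        self.files = []  # each file is a list of Row objects

    def insert(self, *rows):
        # Every insert writes a new file and assigns fresh row IDs.
        self.files.append([Row(next(self._next_id), r) for r in rows])

    def compact(self):
        # Rewrite all files into one; row IDs are carried over, not regenerated.
        self.files = [[row for f in self.files for row in f]]

    def snapshot(self):
        # Map row_id -> values for the current table state.
        return {row.row_id: row.values for f in self.files for row in f}


table = ToyTable()
table.insert((1, "A"))
table.insert((2, "B"))

before = table.snapshot()
table.compact()  # two files become one
after = table.snapshot()
```

In this sketch `before` and `after` are identical even though the underlying files changed, which is the behavior row tracking is meant to guarantee.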
Row tracking ensures:
Correct incremental pipelines: Even after files are compacted, you still get accurate row-level changes.
Accurate CDF outputs: Pre-image and post-image rows in the Change Data Feed are correctly paired.
Safe MERGE, UPDATE, DELETE: Delta knows exactly which rows were modified.
Better performance: Delta can avoid some expensive file-level comparisons because it can identify the changed rows by ID.
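The benefits above all come down to being able to match rows between two table versions by ID. A rough sketch of that matching, again in plain Python rather than the Delta API (`diff_by_row_id` and the snapshot dictionaries are assumptions made for illustration), might look like this:

```python
def diff_by_row_id(old, new):
    """Compare two snapshots, each a dict mapping row_id -> row values."""
    # IDs present only in the new version are inserts.
    inserts = {rid: new[rid] for rid in new.keys() - old.keys()}
    # IDs present only in the old version are deletes.
    deletes = {rid: old[rid] for rid in old.keys() - new.keys()}
    # IDs in both versions with different values are updates,
    # stored as (pre-image, post-image) pairs.
    updates = {
        rid: (old[rid], new[rid])
        for rid in old.keys() & new.keys()
        if old[rid] != new[rid]
    }
    return inserts, updates, deletes


# Version 1 of the example table, keyed by row ID.
v1 = {1: (1, "A"), 2: (2, "B")}
# Version 2: row 1 was updated, row 2 deleted, row 3 inserted.
v2 = {1: (1, "A2"), 3: (3, "C")}

inserts, updates, deletes = diff_by_row_id(v1, v2)
```

Because the pairing is done on the row ID and not on file position or column values, the pre-image and post-image of an update stay linked even if the row moved to a different file between versions.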