08-13-2025 07:30 PM
Hi Ramu1821,
How are you doing today? As I understand it, your use case is keeping only the last 24 hours of data in sync, including inserts, updates, and deletes, within a Delta Live Tables (DLT) pipeline. You're right that create_streaming_live_table(...).create_auto_cdc() with apply_as_deletes = true applies deletes only when a delete event actually arrives; it doesn't clean up older rows that are already in the table.
A common approach is a two-step design: first, capture the CDC events with apply_changes (or your own custom logic) and write them to a staging table; then, in your final “latest” table, keep only the records whose timestamp falls within the last 24 hours. To remove records that have aged out without receiving any new change event, you can run a scheduled batch job (or add the logic to a DLT table in triggered mode) that compares timestamps and deletes anything older than 24 hours.
DLT doesn't offer automatic time-windowed retention with delete tracking out of the box, but combining CDC, timestamp-based filtering, and a scheduled cleanup should get you close to your goal. Let me know if you'd like a pipeline-specific code example to get started!
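To make the two-step idea concrete, here is a rough sketch of the logic in plain Python. It is only a model of the approach, not DLT or Spark code: the staging table is a dict, and the names `apply_cdc_events`, `latest_view`, `purge_expired`, and the event fields (`id`, `op`, `value`, `ts`) are all hypothetical, so you would need to map them onto your own tables and columns.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # assumed retention window


def apply_cdc_events(staging, events):
    """Step 1: apply inserts, updates, and deletes to the staging dict by key."""
    for ev in sorted(events, key=lambda e: e["ts"]):
        current = staging.get(ev["id"])
        if current is not None and current["ts"] > ev["ts"]:
            continue  # skip an event older than the row we already hold
        if ev["op"] == "DELETE":
            staging.pop(ev["id"], None)
        else:  # INSERT or UPDATE
            staging[ev["id"]] = {"value": ev["value"], "ts": ev["ts"]}
    return staging


def latest_view(staging, now):
    """Step 2: expose only rows whose timestamp is inside the window."""
    cutoff = now - RETENTION
    return {k: row for k, row in staging.items() if row["ts"] >= cutoff}


def purge_expired(staging, now):
    """Scheduled cleanup: drop rows that aged out without a new change event."""
    cutoff = now - RETENTION
    expired = [k for k, row in staging.items() if row["ts"] < cutoff]
    for k in expired:
        del staging[k]
    return expired


now = datetime(2025, 8, 13, 19, 30, tzinfo=timezone.utc)
events = [
    {"id": 1, "op": "INSERT", "value": "a", "ts": now - timedelta(hours=30)},
    {"id": 2, "op": "INSERT", "value": "b", "ts": now - timedelta(hours=5)},
    {"id": 3, "op": "INSERT", "value": "c", "ts": now - timedelta(hours=3)},
    {"id": 2, "op": "UPDATE", "value": "b2", "ts": now - timedelta(hours=2)},
    {"id": 3, "op": "DELETE", "value": None, "ts": now - timedelta(hours=1)},
]

staging = apply_cdc_events({}, events)  # rows 1 and 2 remain; row 3 was deleted
latest = latest_view(staging, now)      # row 1 is filtered out (30 hours old)
expired = purge_expired(staging, now)   # row 1 is physically removed
```

Row 3 disappears because a delete event arrived, while row 1 disappears only because the cleanup step noticed it was older than the window. That second case is the one the CDC flow alone does not cover.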
Regards,
Brahma