How to set up partitions on the streaming Delta Table?
08-29-2022 01:34 AM
Let's assume that we have 3 streaming Delta Tables:
- Bronze
- Silver
- Gold
My aim is to add partitioning to the Silver table (for example, by Date).
As a result, the Gold stream will throw an error saying that the source table has been updated, and I would need to set the 'ignoreChanges' option to 'true' on the stream. The stream will then work, but it will move all data from Silver to Gold again (because all files have been rewritten), resulting in duplicates.
My question is: what is the best way to handle this problem?
Is it possible to manipulate the streaming checkpoint somehow?
Labels: Delta table
08-29-2022 01:39 AM
Is the change data feed functionality (of your Silver table) an option, combined with a merge into your Gold table?
https://docs.microsoft.com/en-us/azure/databricks/delta/delta-change-data-feed
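A minimal sketch of what that could look like in PySpark, not a tested pipeline. It assumes change data feed is already enabled on the Silver table (table property `delta.enableChangeDataFeed = true`), that Gold holds one row per key with the same columns as Silver, and that schema auto-merge is off. The table names `silver` and `gold`, the key column `id`, the checkpoint path and the helper function names are all hypothetical.

```python
# Hypothetical names: adjust to your own tables, key and paths.
SILVER_TABLE = "silver"
GOLD_TABLE = "gold"
KEY_COLUMN = "id"  # assumed: a unique key shared by Silver and Gold
CHECKPOINT_PATH = "/tmp/checkpoints/gold_from_silver_cdf"


def merge_condition(key_column: str) -> str:
    """Join condition between the Gold table and a batch of Silver changes."""
    return f"gold.{key_column} = changes.{key_column}"


def upsert_to_gold(batch_df, batch_id):
    """foreachBatch handler: merge one micro-batch of Silver changes into Gold."""
    # Imported here so the module can be loaded without Spark installed.
    from delta.tables import DeltaTable
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    # Keep one row per key: the newest commit wins. If a single commit both
    # deletes and re-inserts a key (as a full rewrite may), prefer the
    # non-delete row.
    is_delete = F.when(F.col("_change_type") == "delete", 1).otherwise(0)
    newest_first = Window.partitionBy(KEY_COLUMN).orderBy(
        F.col("_commit_version").desc(), is_delete.asc()
    )
    changes = (
        batch_df.filter(F.col("_change_type") != "update_preimage")
        .withColumn("_rn", F.row_number().over(newest_first))
        .filter(F.col("_rn") == 1)
        .drop("_rn")
    )

    # DataFrame.sparkSession requires PySpark 3.3 or later.
    gold = DeltaTable.forName(batch_df.sparkSession, GOLD_TABLE)
    (
        gold.alias("gold")
        .merge(changes.alias("changes"), merge_condition(KEY_COLUMN))
        .whenMatchedDelete(condition="changes._change_type = 'delete'")
        .whenMatchedUpdateAll(condition="changes._change_type != 'delete'")
        .whenNotMatchedInsertAll(condition="changes._change_type != 'delete'")
        .execute()
    )


def start_gold_stream(spark):
    """Stream Silver's change feed into Gold through the merge above."""
    return (
        spark.readStream.format("delta")
        .option("readChangeFeed", "true")
        .table(SILVER_TABLE)
        .writeStream.foreachBatch(upsert_to_gold)
        .option("checkpointLocation", CHECKPOINT_PATH)
        .start()
    )
```

Calling `start_gold_stream(spark)` would start the stream. Because Gold is updated with a merge on the key rather than an append, rows that Silver re-emits after being rewritten should overwrite their existing Gold rows instead of duplicating them. That only holds if Gold really is keyed this way; an aggregated Gold table would need different merge logic.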
09-09-2022 06:50 AM
I have never used CDC in Databricks, but my ELT architecture is based on streaming and I don't want to change that.

