Read Data from Updating Streaming Table
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-16-2024 11:52 AM
We are working on a project with the following setup:
- The Bronze table is append-only.
- The Silver table uses "APPLY CHANGES INTO" from the Bronze table and implements SCD Type 1, meaning records can be updated at any time.
We are trying to use the Silver table as a source for the Gold table in a streaming pipeline. However, when the pipeline runs for the second time and records in the Silver table are updated, we encounter the following error:
Flow 'gold_table' has FAILED fatally. An error occurred because we detected an update or delete to one or more rows in the source table. Streaming tables may only use append-only streaming sources. If you expect to delete or update rows to the source table in the future, please convert table gold_table to a materialized view instead of a streaming table. If you are fine to skip changes, set the option 'skipChangeCommits' to 'true'. If you only have a one-time change in the source table and will only have append-only commits from now on, you can perform a full refresh to table gold_table to resolve the issue. A Full Refresh will attempt to clear all data from table gold_table and then load all data from the streaming source. The non-append change can be found at version 265. Operation: MERGE Source table name: silver_table
We cannot use skipChangeCommits as we want to track this updates and move those to Gold table.
How to resolve this issue with given configuration and architecture?
Thank you in advance.
1 REPLY 1
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-16-2024 11:58 AM
Hi @smit_tw,
you can consider the following approach:
- Materialized View: Convert the Gold table to a materialized view instead of a streaming table. This approach will allow you to handle updates and deletes in the Silver table without causing errors in the Gold table. Materialized views do not have the append-only restriction that streaming tables do.
- Full Refresh: If you expect only a one-time change in the Silver table and subsequent operations will be append-only, you can perform a full refresh of the Gold table. This will clear all data from the Gold table and reload all data from the Silver table, resolving the issue caused by the non-append change.
- Change Data Feed (CDF): Utilize the Delta Lake Change Data Feed (CDF) feature to track changes in the Silver table and apply those changes to the Gold table. This approach allows you to capture updates and deletes and propagate them to the Gold table efficiently

