How can I run a streaming query on a new table with the table property delta.enableChangeDataFeed enabled?
12-06-2022 03:35 AM
In Databricks on AWS, I am trying to run a streaming query (trigger=Once) on a table that has delta.enableChangeDataFeed=true in its table definition, as instructed, but it always fails with:
ERROR: Some streams terminated before this command could finish!
com.databricks.sql.transaction.tahoe.DeltaAnalysisException: Error getting change data for range [0 , 0] as change data was not recorded for version [0]. If you've enabled change data feed on this table, use `DESCRIBE HISTORY` to see when it was first enabled.
Otherwise, to start recording change data, use `ALTER TABLE table_name SET TBLPROPERTIES (delta.enableChangeDataFeed=true)`.
I tried:
- removing checkpoints for streaming query
- recreating the table
- adding "spark.databricks.delta.properties.defaults.enableChangeDataFeed": "true" to spark_conf of the compute
None of those fixed the issue. This is quite puzzling as this setting works for other tables.
12-06-2022 06:44 AM
What exactly do you mean by recreating the table? Maybe the Delta files were written by an older version that does not support CDC. Please also share your code.
As a first step, please read the change data as a batch query rather than as a stream: SELECT * FROM table_changes('table_name', 0)
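The batch check suggested here could look roughly like this in a notebook. This is only a sketch: the function names are made up for illustration, `spark` is assumed to be the session Databricks provides, and the table name is a placeholder. It prints the table history (so you can see at which version change data feed was first enabled, as the error message recommends) and then reads the change data as a plain batch query.

```python
def history_query(table_name: str) -> str:
    # DESCRIBE HISTORY lists each table version together with the operation that produced it.
    return f"DESCRIBE HISTORY {table_name}"


def table_changes_query(table_name: str, start_version: int) -> str:
    # table_changes reads the change data feed starting at the given table version.
    return f"SELECT * FROM table_changes('{table_name}', {start_version})"


def check_change_feed(spark, table_name: str, start_version: int = 0):
    """Show the table history, then return the change data as a batch DataFrame.

    `spark` is assumed to be an active SparkSession on a Databricks cluster.
    """
    history = spark.sql(history_query(table_name))
    history.select("version", "timestamp", "operation", "operationParameters").show(truncate=False)
    return spark.sql(table_changes_query(table_name, start_version))
```

If the batch read fails with the same error for version 0, trying a later `start_version` taken from the history output may show from which version change data is actually available.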
My blog: https://databrickster.medium.com/
01-30-2023 12:03 PM
Hi @daniel e
Can you try running a SELECT on the table changes starting from version 0 and see whether you get any output?
SELECT * FROM table_changes('tableName', 0)
Also, please share the streaming query that you are running.
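For comparison, a minimal streaming read of the change data feed with a one-time trigger might look like the sketch below. It is not the original poster's query (which was not shared). The function names, table names, and checkpoint path are placeholders, and it assumes that `starting_version` is a version at which `DESCRIBE HISTORY` shows change data feed already enabled.

```python
def cdf_stream_options(starting_version: int) -> dict:
    # Delta reader options: read the change feed, beginning at the given table version.
    return {"readChangeFeed": "true", "startingVersion": str(starting_version)}


def run_cdf_stream_once(spark, source_table: str, target_table: str,
                        checkpoint_path: str, starting_version: int):
    """Process the available change data once and stop (trigger=Once).

    All arguments are placeholders; `spark` is assumed to be an active SparkSession.
    """
    changes = (
        spark.readStream.format("delta")
        .options(**cdf_stream_options(starting_version))
        .table(source_table)
    )
    query = (
        changes.writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)
        .trigger(once=True)
        .toTable(target_table)
    )
    query.awaitTermination()
    return query
```

Note that a stream restarted from an existing checkpoint resumes from the checkpointed offsets, so a changed `startingVersion` would only be expected to take effect with a fresh checkpoint location.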