06-01-2022 01:56 PM
We have a Delta streaming source in our Delta Live Tables pipelines that may have data deleted from time to time.
The error message is pretty self-explanatory:
...from streaming source at version 191. This is currently not supported. If you'd like to ignore deletes, set the option 'ignoreDeletes' to 'true'.
What's not clear is how to set this option. This is what we have now, but it's not producing the desired result, which is that new data is read and deletes are ignored.
SET pipelines.ignoreDeletes = true;
CREATE OR REFRESH STREAMING LIVE TABLE...
How should this option be set in a delta live table?
06-02-2022 11:00 AM
Hi - Thanks for the response. Does your suggestion work with Delta live tables when you try it? This seems to produce the same error message when I use the code below:
@dlt.table(
...
})
def table_fnc():
    return spark.readStream.format("delta").option("ignoreDeletes", "true").table("tablename")
I'm not worried about duplicates. I just want to stream out the table's current state and append it to a sink in my DLT pipeline. As far as I know, DLT can't just append data from a source unless it's streamed in...
06-10-2022 01:10 PM
I haven't heard back, but the response above was copied and pasted from here: Table streaming reads and writes | Databricks on AWS
We decided to just move these tables to a true structured stream. We hope that DLT can support simple appends later on.
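For anyone taking the same route, here is a minimal sketch of what a plain Structured Streaming append outside DLT might look like. It is not the code we actually run: the function name `start_append_stream`, its parameters, and the table and checkpoint names in the usage line are hypothetical, and the `spark` session is passed in only to keep the sketch self-contained. It assumes both source and sink are Delta tables. Note that, per the Delta documentation, `ignoreDeletes` only ignores deletes made at partition boundaries.

```python
def start_append_stream(spark, source_table, sink_table, checkpoint_path):
    """Append new rows from a Delta source table to a Delta sink table.

    All names here are illustrative; `spark` is an existing SparkSession.
    """
    source = (
        spark.readStream.format("delta")
        # Ignore commits that delete data at partition boundaries
        # instead of failing the stream.
        .option("ignoreDeletes", "true")
        .table(source_table)
    )
    return (
        source.writeStream.format("delta")
        .outputMode("append")
        # The checkpoint tracks which source versions were already processed.
        .option("checkpointLocation", checkpoint_path)
        .toTable(sink_table)
    )
```

Usage would be something like `start_append_stream(spark, "source_tablename", "sink_tablename", "/path/to/checkpoint")`, which returns the running streaming query.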
02-02-2023 11:48 AM
@Kaniz Fatma - Has Databricks found a way to prune unwanted records from a source without requiring the entire sink table be recalculated with DLT?
07-31-2022 08:00 PM
@Kaniz Fatma Hi Kaniz, can we please circle back to this? Like @Zachary Higgins, I am unsure how to set the ignoreDeletes or ignoreChanges spark.sql configuration for my Delta Live Tables pipeline defined in SQL.
Thanks
11-11-2022 04:34 AM
Databricks, please provide an answer to this. There seems to be no documentation on how Delta Live Tables supports table updates. The ignoreChanges option is tied to the spark.readStream method, and it does not seem to be available through dlt.read_stream.
12-18-2022 01:31 PM
I am looking at this as well and would like to understand my options here.