08-08-2022 08:14 PM
I am running a Delta Live Tables pipeline that explodes JSON docs into small Delta Live Tables. The docs can receive multiple updates over the lifecycle of the transaction, and I am curating the data via a medallion architecture. When I run an API /update with
{"full_refresh": "true"}, it resets the checkpoints and runs fine, but when I try to perform an INCREMENTAL update I get the following error:
org.apache.spark.sql.streaming.StreamingQueryException: Query dlt_fulfillment_tickets [id = 7c256d93-6271-4013-9d5d-fe356c18511f, runId = 1aba276e-1118-43c1-b1fa-85e688bf523b] terminated with exception: Detected a data update (for example part-00000-ba0db042-39f9-450b-ad19-3f05afb52830-c000.snappy.parquet) in the source table at version 10. This is currently not supported. If you'd like to ignore updates, set the option 'ignoreChanges' to 'true'. If you would like the data update to be reflected, please restart this query with a fresh checkpoint directory.
Is there a way to set the above option via SQL? My entire pipeline is in SQL.
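For reference, this is roughly how I trigger the updates via the REST API (a minimal Python sketch, assuming the standard Pipelines "start an update" endpoint; the host, token, and pipeline ID are placeholders):

import requests

# Placeholders - substitute your own workspace URL, token, and pipeline ID.
HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"
PIPELINE_ID = "<pipeline-id>"

# POST /api/2.0/pipelines/{pipeline_id}/updates starts a pipeline update.
# full_refresh=True resets the checkpoints and reprocesses everything;
# full_refresh=False runs incrementally, which is where the error appears.
resp = requests.post(
    f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}/updates",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"full_refresh": False},  # True for the full refresh that works
)
resp.raise_for_status()
print(resp.json())  # the response identifies the update that was started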
08-09-2022 01:56 AM
As per the docs there is Scala code to do this, but I can't find a SQL alternative. However, I have raised this with the product team to see if we can get SQL support for the same. I shall let you know once I get a response from them.
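For reference, the documented snippet sets the option on the streaming reader. A Python equivalent would look roughly like this (a sketch; "source_table" is a placeholder, and spark is the ambient SparkSession in a notebook):

# ignoreChanges tells the Delta streaming source not to fail on updates or
# deletes in the source table. Rewritten files are re-emitted, so anything
# downstream must be able to tolerate duplicate records.
df = (
    spark.readStream
    .option("ignoreChanges", "true")
    .table("source_table")  # placeholder: the source Delta table
)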
08-09-2022 07:27 AM
@Danny Aguirre I had a discussion with the product team, and they mentioned that streaming only supports processing append-only changes. If you expect updates to the input, you should use normal (non-streaming) live tables instead. The error message is not appropriate for this issue, and they will be fixing it so that customers are not pointed in the wrong direction when investigating.
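To illustrate the distinction, here is a minimal Python sketch with placeholder table names: dlt.read_stream consumes append-only changes from the source, while dlt.read recomputes the table from the full input on every pipeline update.

import dlt

# Streaming live table: consumes append-only changes, so an update or
# delete in bronze_docs raises the "Detected a data update" error above.
@dlt.table
def silver_tickets_streaming():
    return dlt.read_stream("bronze_docs")  # placeholder upstream table

# Normal live table: fully recomputed on each pipeline update, so updated
# rows in bronze_docs are handled without error.
@dlt.table
def silver_tickets():
    return dlt.read("bronze_docs")

In SQL, the same distinction is between CREATE STREAMING LIVE TABLE ... AS SELECT ... FROM STREAM(live.bronze_docs) and CREATE LIVE TABLE ... AS SELECT ... FROM live.bronze_docs.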
09-07-2022 05:58 AM
Hey there @Danny Aguirre
Does @Prabakar Ammeappin's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly?
We'd love to hear from you.
Thanks!
09-07-2022 06:10 AM
Hi @Vidula Khanna - The response was not a solution for my issue; it was more an acknowledgement that there is (or was) a gap in the documentation, as the error was not pointing customers to the correct path to solve this issue. Hopefully this has been taken care of by Databricks.
I had to refactor some SQL code to find a workaround.
Thanks for the follow-up.