Delta Live Table - How to pass OPTION "ignoreChanges" using SQL?

Sparks
New Contributor III

I am running a Delta Live Tables pipeline that explodes JSON docs into small Delta Live tables. The docs can receive multiple updates over the lifecycle of the transaction, and I am curating the data via a medallion architecture. When I run an API /update with

{"full_refresh":"true"}

it resets the checkpoints and runs fine. When I try to perform an incremental update, I get the following error:

org.apache.spark.sql.streaming.StreamingQueryException: Query dlt_fulfillment_tickets [id = 7c256d93-6271-4013-9d5d-fe356c18511f, runId = 1aba276e-1118-43c1-b1fa-85e688bf523b] terminated with exception: Detected a data update (for example part-00000-ba0db042-39f9-450b-ad19-3f05afb52830-c000.snappy.parquet) in the source table at version 10. This is currently not supported. If you'd like to ignore updates, set the option 'ignoreChanges' to 'true'. If you would like the data update to be reflected, please restart this query with a fresh checkpoint directory.

Is there a way to set the above option via SQL? My entire pipeline is in SQL.


4 REPLIES

Prabakar
Esteemed Contributor III

As per the doc, we have Scala code to do this. I don't find a SQL alternative for it. However, I have raised this with the product team to see if we can get SQL syntax for the same. I shall let you know once I get a response from them.

Prabakar
Esteemed Contributor III

@Danny Aguirre I had a discussion with the product team, and they mentioned that streaming only supports processing append-only changes. If you expect updates to the input, then you should use normal live tables. The error message is not appropriate for this issue, and they will be fixing it to ensure customers are not pointed in the wrong direction during investigation.
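
For reference, a minimal sketch of that difference in DLT SQL; the table names (fulfillment_tickets_raw, fulfillment_tickets) are placeholders, not from the original pipeline:

-- Streaming live table: reads the source incrementally via STREAM()
-- and expects append-only input, so upstream updates raise the error above.
CREATE OR REFRESH STREAMING LIVE TABLE fulfillment_tickets_stream
AS SELECT * FROM STREAM(LIVE.fulfillment_tickets_raw);

-- Normal live table: recomputed from the source on each pipeline update,
-- so updates to existing source rows are reflected without needing 'ignoreChanges'.
CREATE OR REFRESH LIVE TABLE fulfillment_tickets
AS SELECT * FROM LIVE.fulfillment_tickets_raw;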

Vidula
Honored Contributor

Hey there @Danny Aguirre

Does @Prabakar Ammeappin's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly?

We'd love to hear from you.

Thanks!

Sparks
New Contributor III

Hi @Vidula Khanna - The response was not a solution to my issue; it was more an acknowledgement that there is/was a gap in the documentation, as the error was not pointing customers to the correct path to solve this issue. Hopefully this has been taken care of by Databricks.

I had to refactor some SQL code to find a workaround.

Thanks for the follow-up.
