Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Dealing with updates to a delta table being used as a streaming source

Confused
New Contributor III

Hi All

I have a requirement to perform updates on a delta table that is the source for a streaming query.

I would like to be able to update the table and have the stream continue to work while also not ending up with duplicates.

From my research, the ignoreDeletes option will not work because my updates and deletes are not aligned with the partition column. The ignoreChanges option also looks unsuitable because it re-emits not only the rows I update but also every other row in the same rewritten files, which produces duplicates downstream.

Does anyone have suggestions or procedures they have used for similar situations in the past?

Thanks

Accepted Solutions

Manjunath
Databricks Employee

Hi @Leszek

In your case the ignoreChanges option will work, but your streaming application needs to handle the resulting duplicates when it writes to the sink. If your sink is a Delta table, you can use a Delta streaming merge.

https://docs.databricks.com/_static/notebooks/merge-in-streaming.html
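
For anyone who wants a starting point, here is a minimal sketch of that approach in PySpark: read the source with ignoreChanges, then deduplicate each micro-batch and MERGE it into the Delta sink inside foreachBatch. The paths, the key column "id", and the ordering column "updated_at" are hypothetical placeholders, not taken from the original question, and the sketch assumes the sink table already exists and that PySpark 3.3+ is available (for DataFrame.sparkSession).

```python
# Sketch only: every path and column name below is a hypothetical placeholder.
SOURCE_PATH = "/tmp/delta/source"
SINK_PATH = "/tmp/delta/sink"
CHECKPOINT_PATH = "/tmp/delta/_checkpoints/source_to_sink"
KEY_COLUMNS = ["id"]           # assumed business key that identifies a row
ORDER_COLUMN = "updated_at"    # assumed column that orders versions of a row


def merge_condition(key_columns, target="t", source="s"):
    """Build the ON clause that matches sink rows to incoming rows by key."""
    return " AND ".join(f"{target}.{c} = {source}.{c}" for c in key_columns)


def upsert_to_sink(micro_batch_df, batch_id):
    """foreachBatch handler: keep the latest row per key, then MERGE into the sink."""
    # Imported here so the module can be loaded without Spark on the path.
    from delta.tables import DeltaTable
    from pyspark.sql import Window, functions as F

    # With ignoreChanges, a rewritten file re-emits all of its rows, so a key
    # can appear more than once in a micro-batch. MERGE fails when several
    # source rows match the same target row, so keep one row per key first.
    latest_first = Window.partitionBy(*KEY_COLUMNS).orderBy(F.col(ORDER_COLUMN).desc())
    deduped = (
        micro_batch_df
        .withColumn("_rn", F.row_number().over(latest_first))
        .filter(F.col("_rn") == 1)
        .drop("_rn")
    )

    # DataFrame.sparkSession requires PySpark 3.3 or later.
    sink = DeltaTable.forPath(micro_batch_df.sparkSession, SINK_PATH)
    (
        sink.alias("t")
        .merge(deduped.alias("s"), merge_condition(KEY_COLUMNS))
        .whenMatchedUpdateAll()      # re-emitted or updated rows overwrite in place
        .whenNotMatchedInsertAll()   # genuinely new rows are inserted
        .execute()
    )


def start_stream(spark):
    """Start the stream from the updated Delta source into the Delta sink."""
    return (
        spark.readStream.format("delta")
        .option("ignoreChanges", "true")  # tolerate updates/deletes in the source
        .load(SOURCE_PATH)
        .writeStream
        .foreachBatch(upsert_to_sink)
        .option("checkpointLocation", CHECKPOINT_PATH)
        .start()
    )
```

Because the MERGE is keyed, unchanged rows that get re-emitted simply overwrite themselves with identical values, so the sink does not accumulate duplicates. Note that this does not propagate deletes from the source; those would need separate handling.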

Replies

Leszek
Contributor

Maybe merging the data from the updated Delta table into the next Delta table in the stream would work?

https://www.youtube.com/watch?v=2Iy5S0Hf4XM

Anonymous
Not applicable

Hey @Mathew Walters

Hope you are doing great.

Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!
