Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Incremental updates in Delta Live Tables

morganmazouchi
Databricks Employee

What happens if we change the logic for a Delta Live Tables table and then run an incremental update? Does the table get reset (refreshed) automatically, or is the new logic applied only to newly arriving data? Would we have to trigger a reset in this case?

1 ACCEPTED SOLUTION

morganmazouchi
Databricks Employee

Here is my finding on when to refresh (reset) the table:

If it is a complete table, all the changes are applied automatically.

If it is an incremental table, you need to do a manual reset (full refresh).
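
To make the difference concrete, here is a small toy model in plain Python. It is not the Delta Live Tables API: the class names CompleteTable and IncrementalTable and the full_refresh method are made up for illustration, and the model only assumes that an incremental table remembers how much of the source it has already processed.

```python
# Toy model only: plain Python, NOT the Delta Live Tables API.
# All class and method names here are hypothetical.

class CompleteTable:
    """Recomputed from the full source on every update."""

    def __init__(self):
        self.rows = []

    def update(self, source, logic):
        # Every update reapplies the current logic to all source rows.
        self.rows = [logic(x) for x in source]


class IncrementalTable:
    """Processes only the source rows it has not seen yet."""

    def __init__(self):
        self.rows = []
        self.processed = 0  # number of source rows already consumed

    def update(self, source, logic):
        # Only rows that arrived since the last update see the current logic.
        new_rows = source[self.processed:]
        self.rows.extend(logic(x) for x in new_rows)
        self.processed = len(source)

    def full_refresh(self, source, logic):
        # Manual reset: drop all state and reprocess everything.
        self.rows = []
        self.processed = 0
        self.update(source, logic)


def old_logic(x):
    return x * 10


def new_logic(x):
    return x * 100


source = [1, 2, 3]
complete, incremental = CompleteTable(), IncrementalTable()
complete.update(source, old_logic)
incremental.update(source, old_logic)

# New data arrives and the logic changes before the next update.
source = source + [4]
complete.update(source, new_logic)
incremental.update(source, new_logic)

print(complete.rows)     # [100, 200, 300, 400]
print(incremental.rows)  # [10, 20, 30, 400]  (old rows keep the old logic)

incremental.full_refresh(source, new_logic)
print(incremental.rows)  # [100, 200, 300, 400]
```

In this model the incremental table ends up with a mix of old and new logic until the full refresh is run, which is why the manual reset is needed.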


7 REPLIES

-werners-
Esteemed Contributor III

I doubt the table gets reset, as that would mean you could never change anything in a current setup.

Delta Live Tables was created to make life easier, so my guess is that the new logic applies to new data only.

Hubert-Dudek
Esteemed Contributor III

Delta is transactional, so nothing will be reset. Data can be deleted, but you can always go back to a past version using time travel 🙂

One thing you may be referring to is partitioning. When you overwrite data, all partitions are overwritten, but you can use dynamic partition overwrite so that only the partitions that receive new data are overwritten:

spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
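
As a rough illustration of what that setting changes, here is a toy model in plain Python rather than Spark. The overwrite function and the dictionaries are hypothetical stand-ins: each dictionary maps a partition value to the rows stored in that partition. "static" is Spark's default mode.

```python
# Toy model only: plain Python, not Spark.
# `overwrite` is a hypothetical helper; dicts map partition value -> rows.

def overwrite(table, incoming, mode="static"):
    """Return the table contents after an overwrite in the given mode."""
    if mode == "static":
        # Every existing partition is dropped; only incoming data remains.
        return dict(incoming)
    if mode == "dynamic":
        # Only partitions present in the incoming data are replaced.
        merged = dict(table)
        merged.update(incoming)
        return merged
    raise ValueError(f"unknown mode: {mode}")


table = {"2022-01-01": ["a", "b"], "2022-01-02": ["c"]}
incoming = {"2022-01-02": ["c2"], "2022-01-03": ["d"]}

static_result = overwrite(table, incoming, mode="static")
dynamic_result = overwrite(table, incoming, mode="dynamic")

print(static_result)
# {'2022-01-02': ['c2'], '2022-01-03': ['d']}
print(dynamic_result)
# {'2022-01-01': ['a', 'b'], '2022-01-02': ['c2'], '2022-01-03': ['d']}
```

With the dynamic mode, the 2022-01-01 partition survives because the incoming data contains no rows for it.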

Hi @Mojgan Mazouchi,

According to the docs "Tables can be incremental or complete. Incremental tables support updates based on continually arriving data without having to recompute the entire table. A complete table is entirely recomputed with each update."

Docs here https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-user-guide.html#dat...

That implies that for complete tables, since they are recomputed on every update, a logic change is picked up without refreshing the pipeline, whereas for incremental tables we should run a full refresh of the pipeline so that the change is also applied to data that was already processed, right?

According to the docs, yes.

Thanks Jose!
