How to refresh a single table in Delta Live Tables?

tinai_long
New Contributor III

Suppose I have a Delta Live Tables framework with 2 tables:

  • Table 1 ingests from a json source,
  • Table 2 reads from Table 1 and runs some transformation.

In other words, the data flow is json source -> Table 1 -> Table 2.

Now if I find some bugs in the transformation Table 1 -> Table 2, how can I re-run only the transformation Table 1 -> Table 2 and leave Table 1 intact?

If I use Full Refresh, it would refresh Table 1 & rerun the json ingestion as well...

Accepted Solution

Dooley
Valued Contributor II

@Long Tran, the best way to achieve this is a workaround. If you can sacrifice a row of data in Table 1, or add a row of nulls to Table 1, without causing problems further down your pipeline, I suggest you try the approach described under "Retain manual deletes or updates": "You can manually delete or update the record from raw_user_table and do a refresh operation to recompute the downstream tables."

However, I want to note that if your source has no new data to ingest, a full refresh probably won't cause duplication by re-ingesting the same data. Try it out on a subset of your data first.

Replies

tinai_long
New Contributor III

Thank you so much!

Sorry for the delayed response. User16460565755155528764's answer is very helpful.

Anonymous
Not applicable

Hey @Long Tran​ 

Does @Sara Dooley's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Cheers!

tinai_long
New Contributor III

Hi, sorry I am new here - how do I mark the answer as resolved? Thanks a lot.

Anonymous
Not applicable

Hi @Long Tran​ 

Thank you so much for getting back to us. It's really great of you to mark the answer as best. 

We really appreciate your time.

Wish you a great Databricks journey ahead!

Felipe
New Contributor II

An update to anyone finding this thread nowadays.

This is possible using the reset.allowed property as documented here: https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-cookbook.html#retai...
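For anyone who cannot open the truncated link: as far as I know, the full name of the table property is `pipelines.reset.allowed`. Below is a minimal sketch of how it might be applied to Table 1 so that a full refresh leaves its data in place while Table 2 is recomputed. The table name `table_1` and the constant name are hypothetical, and the decorator usage is shown only in comments because the `dlt` module is available only inside a pipeline.

```python
# Assumption: "pipelines.reset.allowed" is the full property name behind the
# "reset.allowed" mentioned above. Setting it to "false" is meant to stop a
# full refresh from clearing the table that carries it.
TABLE_1_PROPERTIES = {"pipelines.reset.allowed": "false"}

# Inside a pipeline notebook, the dict would be passed to the table decorator,
# roughly like this (hypothetical table name and source path):
#
#   import dlt
#
#   @dlt.table(name="table_1", table_properties=TABLE_1_PROPERTIES)
#   def table_1():
#       return spark.readStream.format("cloudFiles") \
#           .option("cloudFiles.format", "json") \
#           .load("<path-to-json>")

print(TABLE_1_PROPERTIES)
```

With this in place, a full refresh of the pipeline should rebuild Table 2 from the existing contents of Table 1. Try it on a test pipeline first.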

cpayne_vax
New Contributor III

I want to tag onto this thread because I have the same need to refresh only a single table within a larger DLT pipeline. Unfortunately, it seems the links in the accepted answer and in Felipe's follow-up no longer contain the correct information. Is there a proper way to do this now, in 2024?

cpayne_vax
New Contributor III

Answering my own question: nowadays (February 2024) this can all be done via the UI.

When viewing your DLT pipeline there is a "Select tables for refresh" button in the header. If you click this, you can select individual tables, and then in the bottom right corner there are options to "Full refresh selection" or "Refresh selection." Select "Full" in order to start your table over clean.

Just wondering if there is a way to do this through code rather than the UI. It's now Dec 2024. We would like to handle this in testing using our workflows.

Thanks 
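One possible way to do this from code is the Pipelines REST API, whose "start update" endpoint accepts `refresh_selection` and `full_refresh_selection` lists that appear to mirror the two UI buttons. The sketch below only builds the request and does not send it. The helper name `build_refresh_request`, the host, the token, the pipeline ID, and the table name are all placeholders and not taken from this thread.

```python
import json
import urllib.request


def build_refresh_request(host, token, pipeline_id,
                          full_refresh_tables=(), refresh_tables=()):
    """Build (but do not send) a POST that starts an update for selected tables."""
    body = {}
    if refresh_tables:
        # Counterpart of "Refresh selection" in the UI.
        body["refresh_selection"] = list(refresh_tables)
    if full_refresh_tables:
        # Counterpart of "Full refresh selection" in the UI.
        body["full_refresh_selection"] = list(full_refresh_tables)
    url = f"{host.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own workspace URL, token, and pipeline ID.
req = build_refresh_request(
    "https://example.cloud.databricks.com",
    "my-token",
    "1234-abcd",
    full_refresh_tables=["table_2"],
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually start the update: urllib.request.urlopen(req)
```

In a workflow, a pipeline task or a small notebook step could issue this call before the tests run, so that only Table 2 is rebuilt and Table 1 is left alone.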
