
How to Incorporate Historical Data in Delta Live Pipeline?

blakedwb
New Contributor III

Now that Delta Live Tables (DLT) is GA, we are looking to convert our existing processes to use it. One thing that remains unclear is how to populate new Delta Live Tables with historical data.

Currently we plan to use CDC, calling create_target_table and then apply_changes into a bronze and a silver layer to keep history going forward. When I try to merge into the table created by create_target_table from outside the DLT pipeline, I get an error saying the target must be a Delta table and not a view.

I have also tried dropping the view and recreating it as a managed Delta table. I can populate that table with the historical data, but I cannot use it in the DLT pipeline.

The other option I am considering is having the DLT pipeline run a different set of code once, pulling from the existing Delta tables, and then switching to the daily code afterwards.

We have ~150M rows in a Delta table that we would like to incorporate into the DLT pipeline. How can we populate the DLT bronze and silver layers with historical data from managed Delta tables? I would like to avoid running the entire ETL process for all rows. Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions

blakedwb
New Contributor III

@Kaniz Fatma Hello, sorry for the delayed response. The guide does not answer how to incorporate existing Delta tables that contain historical data into a Delta Live Tables pipeline. We ended up changing the source data to pull from the existing bronze table as a workaround.
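For anyone landing here later, the workaround above might look roughly like the sketch below. It is not the poster's actual code. The table names, key column, and sequencing column are hypothetical, and the `dlt` and `spark` objects are passed in as parameters only so the function can be exercised outside a pipeline. Inside a DLT notebook you would `import dlt` and pass the notebook's `spark` session. The sketch assumes the existing bronze table is append-only, since streaming from a Delta table that has updates or deletes needs extra reader options.

```python
def define_backfill_flow(dlt, spark, source_table, target, keys, sequence_by):
    """Point a DLT CDC flow at an existing Delta table instead of raw files.

    dlt          -- the Delta Live Tables module (injected; `import dlt` in a pipeline)
    spark        -- the active SparkSession
    source_table -- hypothetical name of the existing bronze Delta table
    target       -- name of the DLT target table to create
    keys         -- list of primary-key column names
    sequence_by  -- column that orders changes for each key
    """
    view_name = f"{target}_history"

    # A DLT view that streams rows from the pre-existing Delta table.
    # The first pipeline run reads the full history; later runs only
    # pick up rows appended since the last checkpoint.
    @dlt.view(name=view_name)
    def history():
        return spark.readStream.table(source_table)

    # create_target_table is the name used in this thread; newer DLT
    # releases expose the same functionality as create_streaming_table.
    dlt.create_target_table(name=target)

    # Upsert the streamed rows into the target, ordered by sequence_by.
    dlt.apply_changes(
        target=target,
        source=view_name,
        keys=keys,
        sequence_by=sequence_by,
    )
    return view_name
```

In a pipeline notebook the call would be something like `define_backfill_flow(dlt, spark, "legacy.bronze_orders", "silver_orders", ["order_id"], "updated_at")`, where every argument after `spark` is a made-up example. Because the view is a stream, the 150M historical rows are processed once on the first update rather than on every run.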


2 REPLIES

Thank you for your reply. Marking your response as best.
