06-01-2022 11:23 AM
Now that Delta Live Tables pipelines are GA, we are looking to convert our existing processes to use them. One thing that remains unclear is how to populate new Delta Live Tables with historical data.
Currently we are looking to use CDC, calling create_target_table and then apply_changes into a bronze and a silver layer to keep history going forward. When I try to merge into the table created by create_target_table from outside the DLT pipeline, I get an error saying it must be a Delta table and not a view.
I have also tried dropping the view and recreating it as a managed Delta table. I can populate that table with the historical data, but I cannot use it in the DLT pipeline.
The other option I am considering is having the DLT pipeline run a different set of code once, pulling from the existing Delta tables, and then switching to the daily code afterwards.
We have ~150m rows in a Delta table that we would like to incorporate into the DLT pipeline. How can we populate the DLT bronze and silver layers with historical data from managed Delta tables? I would like to avoid running the entire ETL process for all rows. Thanks!
06-15-2022 06:53 AM
@Kaniz Fatma Hello, sorry for the delayed response. The guide does not answer how to incorporate existing Delta tables that contain historical data into a Delta Live Tables pipeline. We ended up changing the source data to pull from the existing bronze table as a workaround.
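For anyone landing here later, a rough sketch of what that kind of workaround might look like. The post above does not show the actual code, so everything here is an assumption: the table name `legacy.bronze_events`, the target `silver_events`, the key `id`, and the sequencing column `updated_at` are all hypothetical, and the existing table is assumed to be append-only so it can be read as a stream. The `dlt` module and `spark` session are passed in as arguments only so the definition can be read and exercised outside a pipeline.

```python
def define_backfill_pipeline(dlt, spark,
                             history_table="legacy.bronze_events",  # hypothetical name
                             target="silver_events",                # hypothetical name
                             keys=("id",),                          # hypothetical key column
                             sequence_by="updated_at"):             # hypothetical ordering column
    """Register a view over an existing Delta table and apply its rows as changes."""

    @dlt.view(name="bronze_history")
    def bronze_history():
        # Read the existing table as a stream so that rows processed in one
        # pipeline update are not picked up again in the next one.
        return spark.readStream.table(history_table)

    # Declare the target table, then upsert rows from the view into it,
    # keeping the latest row per key according to the sequencing column.
    dlt.create_target_table(name=target)
    dlt.apply_changes(
        target=target,
        source="bronze_history",
        keys=list(keys),
        sequence_by=sequence_by,
    )
```

Inside a DLT notebook this would be called as `define_backfill_pipeline(dlt, spark)` after `import dlt`. New daily data would still need to land in, or be unioned with, the same source for the target to keep receiving changes.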
06-02-2022 04:23 AM
Hi @Blake Brown, This guide will demonstrate how you can leverage Change Data Capture in Delta Live Tables pipelines to identify new records and capture changes made to the data set in your data lake.
Delta Live Tables pipelines enable you to develop scalable, reliable, and low latency data pipelines while performing Change Data Capture in your data lake with the minimum required computation resources and seamless out-of-order data handling.
Note: We recommend following the Getting Started with Delta Live Tables guide, which explains how to create scalable and reliable pipelines using Delta Live Tables (DLT) and its declarative ETL definitions.
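To make the out-of-order handling mentioned above concrete, here is a toy pure-Python model of a sequence-keyed upsert. It is only an illustration of the idea, not how DLT implements it, and the function name `apply_changes_sim` and the sample rows are made up. The point is that when each change carries a sequence value, historical rows and new changes end in the same state regardless of the order in which they arrive.

```python
from typing import Dict, Iterable, Tuple

# (key, sequence number, value)
Row = Tuple[str, int, str]

def apply_changes_sim(events: Iterable[Row]) -> Dict[str, Tuple[int, str]]:
    """Keep, for each key, the value that has the highest sequence number."""
    state: Dict[str, Tuple[int, str]] = {}
    for key, seq, value in events:
        current = state.get(key)
        # A late-arriving event with an older sequence number is ignored.
        if current is None or seq > current[0]:
            state[key] = (seq, value)
    return state

history = [("a", 1, "old"), ("b", 1, "old")]  # rows from an existing table
daily = [("a", 2, "new")]                     # a newer change for key "a"

in_order = apply_changes_sim(history + daily)
out_of_order = apply_changes_sim(daily + history)
```

Both `in_order` and `out_of_order` hold the newer value for key `a` and the historical value for key `b`.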
06-09-2022 12:33 AM
Hi @Blake Brown, We haven’t heard from you on the last response from me, and I was checking back to see if you have a resolution yet. If you have any solution, please do share that with the community as it can be helpful to others. Otherwise, we will respond with more details and try to help.
07-28-2022 05:43 PM
Thank you for your reply. Marking your response as best.