Tuesday
Hi!
I am creating an Asset Bundle, which also includes my streaming Delta Live Table Pipelines. I want to move these DLT pipelines to the Asset Bundle, without having to run my DLT streaming Pipeline on all historical files (this takes a lot of compute and time). Is there a way to migrate an existing DLT Pipeline to Asset Bundles?
Tuesday - last edited Tuesday
Maybe try the bind command? It lets you link bundle-defined jobs and pipelines to existing jobs and pipelines in the Databricks workspace, so that they become managed by Databricks Asset Bundles:
https://docs.gcp.databricks.com/en/dev-tools/cli/bundle-commands.html#bind-bundle-resourcesbundle co...
databricks bundle deployment bind [resource-key] [resource-id]
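Here is a minimal sketch of how the bind step might fit into a deployment. It assumes the bundle defines the pipeline under the hypothetical resource key `my_dlt_pipeline` and has a target named `dev`; the pipeline ID is a placeholder you would copy from the existing pipeline's settings. The `run` helper is only an illustration: it prints each command by default and executes it when `DRY_RUN=0`.

```shell
#!/bin/sh
set -eu

# Hypothetical values: replace with your own resource key, pipeline ID, and target.
RESOURCE_KEY="my_dlt_pipeline"
PIPELINE_ID="<existing-pipeline-id>"
TARGET="dev"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Check that the bundle configuration is valid.
run databricks bundle validate -t "$TARGET"

# 2. Link the bundle-defined pipeline to the pipeline that already exists.
run databricks bundle deployment bind "$RESOURCE_KEY" "$PIPELINE_ID" -t "$TARGET"

# 3. Deploy; the bound pipeline is updated in place rather than created anew.
run databricks bundle deploy -t "$TARGET"
```

Running the bind before the first deploy is the important part: a deploy without a binding creates a separate pipeline.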
Tuesday
Yes, you can migrate an existing Delta Live Table (DLT) pipeline to an Asset Bundle without having to reprocess all historical files. Here are the steps to achieve this:
Create a Databricks Asset Bundle: Use the Databricks CLI to initialize a new bundle. This will create a databricks.yml file in the root of your project, which will be used to define your Databricks resources, including your DLT pipelines.
Define the DLT Pipeline in the Bundle: In the databricks.yml file, you will need to define your DLT pipeline. This involves specifying the pipeline's configuration, such as the path to the notebook or script that defines the pipeline logic.
Deploy the Bundle: Use the Databricks CLI to deploy the bundle to your target environment. This will create the necessary resources in your Databricks workspace based on the definitions in the databricks.yml file.
Run the Pipeline: Once the bundle is deployed, you can run the DLT pipeline from the Databricks extension panel or using the CLI. This will start the pipeline without reprocessing all historical files, as the pipeline will continue from its last processed state.
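As a rough illustration of the "define the pipeline" step, the sketch below writes a minimal databricks.yml into a scratch directory. The bundle name, resource key, pipeline name, and notebook path are all hypothetical, and a real definition would also need to carry over the existing pipeline's settings (storage or catalog, target schema, cluster configuration, and so on).

```shell
#!/bin/sh
set -eu

# Write the sketch into a scratch directory so an existing databricks.yml is not overwritten.
mkdir -p bundle_sketch

cat > bundle_sketch/databricks.yml <<'EOF'
bundle:
  name: my_bundle              # hypothetical bundle name

resources:
  pipelines:
    my_dlt_pipeline:           # resource key (hypothetical)
      name: my_dlt_pipeline    # pipeline name shown in the workspace
      libraries:
        - notebook:
            path: ./src/dlt_pipeline.ipynb   # hypothetical notebook path

targets:
  dev:
    mode: development
    default: true
EOF

echo "wrote bundle_sketch/databricks.yml"
```

The key under `resources.pipelines` is the resource key that `databricks bundle deployment bind` expects. In development mode the CLI also prefixes the names of deployed resources, so the name shown in the workspace can differ from the `name` field here.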
Tuesday
I have done these steps, but my DLT still took a long time to process. However, the path to the notebook with my pipeline logic has changed because I am deploying it as a bundle. Is this a problem? Also, the name of the pipeline changed because of a prefix I added in the Asset Bundle.
Tuesday - last edited Tuesday
Hi @Isa1 ,
If you have existing pipelines that were created using the Databricks user interface or API that you want to move to bundles, you must define them in a bundle's configuration files. Databricks recommends that you first create a bundle using the steps below and then validate whether the bundle works. You can then add additional definitions, notebooks, and other sources to the bundle.
You can follow the official documentation entry and repeat its steps:
Develop Delta Live Tables pipelines with Databricks Asset Bundles | Databricks on AWS
Tuesday
Changing the path to the notebook or the name of the pipeline in your Delta Live Table (DLT) pipeline can indeed cause issues. Specifically, either change can lead to the pipeline being recreated, and a recreated pipeline has no previous state to continue from.
Tuesday
And to add one thing: in Delta Live Tables, checkpoints are stored under the storage location specified in the DLT settings. Each table gets a dedicated directory under storage_location/checkpoints/<dlt_table_name>. So if you would like to avoid running your pipeline from the start, you need to use the bind command, because otherwise the new pipeline name will create a new checkpoint directory.
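To make the checkpoint layout concrete, here is a small sketch that builds the checkpoint path for one table in two pipelines. The storage locations and table name are hypothetical, and the sketch assumes each pipeline has its own storage location, which is why a newly created pipeline would not find the old checkpoints.

```shell
#!/bin/sh
set -eu

# Build the checkpoint directory for a table, following
# storage_location/checkpoints/<dlt_table_name>.
checkpoint_dir() {
  # $1 = pipeline storage location, $2 = DLT table name
  echo "$1/checkpoints/$2"
}

# Hypothetical storage locations for the existing pipeline and a newly created one.
OLD_STORAGE="dbfs:/pipelines/existing-pipeline"
NEW_STORAGE="dbfs:/pipelines/new-bundle-pipeline"
TABLE="my_streaming_table"

echo "existing: $(checkpoint_dir "$OLD_STORAGE" "$TABLE")"
echo "new:      $(checkpoint_dir "$NEW_STORAGE" "$TABLE")"
```

Binding keeps the bundle pointed at the existing pipeline, so the first of the two paths stays in use.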