Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Is it possible to migrate data from one DLT pipeline to another?

MarkD
New Contributor II

Hi,

We have a DLT pipeline with a Hive metastore target that has been running for a while and has stored billions of records. We'd like to move the data to Unity Catalog. The documentation says: "Existing pipelines that use the Hive metastore cannot be upgraded to use Unity Catalog. To migrate an existing pipeline that writes to Hive metastore, you must create a new pipeline and re-ingest data from the data source(s)." The problem is that the original data sources no longer exist, so we can't just start a new pipeline and re-ingest everything. Is there any way to migrate or copy the data from the existing pipeline into a new one as its starting point, so that it doesn't have to start from the beginning?

1 REPLY

Yeshwanth
Databricks Employee

@MarkD, good day!

I'm sorry, but according to the documentation, existing pipelines that use the Hive metastore cannot be upgraded to use Unity Catalog. To migrate an existing pipeline that writes to the Hive metastore, you must create a new pipeline and re-ingest the data from the data source(s). If the original data sources are no longer available, there is no documented method to migrate or copy the data from the existing pipeline to a new one.

The documentation suggests that the data must be re-ingested from the original data sources when creating a new pipeline, and there is no mention of a method to use data from an existing pipeline as the starting point for a new pipeline.

Doc: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#limitations

Kind regards,

Yesh