01-21-2023 11:36 AM
I am looking for a way to copy a large managed Delta table (about 4 TB) from one environment (QA) to another (Prod). QA and Prod are in different subscriptions and different regions. I understand Databricks provides a way to clone tables, but I am not sure whether cloning works across subscriptions. There is network connectivity between QA and Prod in case files need to be copied from the lower to the higher environment. I am sure I am not the first person trying to copy tables across environments. Can you share how you performed such a copy or migration?
01-22-2023 04:37 AM
Use DEEP CLONE:
CREATE TABLE delta.`/data/target/` CLONE delta.`/data/source/` -- Create a deep clone of /data/source at /data/target
Reference: https://docs.databricks.com/optimizations/clone.html
01-22-2023 07:41 PM
Does it support cloning across subscriptions? If so, can you share an example?
01-22-2023 07:52 AM
I don't know if it would be an ideal option, but please read more about Unity Catalog and Delta Sharing. DEEP CLONE sounds good.
01-22-2023 07:43 PM
We are not using Unity Catalog. This is still based on the Hive catalog.
01-23-2023 01:34 AM
@Ratnadeep Bose
The best way would be to create a storage location that is used to copy the data between the two environments.
That way you have the same data in both subscriptions.
01-24-2023 12:29 PM
Just to be clear, we are using a managed Delta table, not an external table. I am not sure whether the above solution will still work. Thanks very much for your feedback.
01-24-2023 10:32 PM
@Ratnadeep Bose
That's why I mentioned creating an external table to be used for copying the data between the two environments. It should be a copy of the source table, but with its location on the shared storage.
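A minimal sketch of this staging approach, assuming a storage account that clusters in both subscriptions are configured to read and write. The storage path and the `qa_db` / `prod_db` table names are hypothetical, not taken from this thread. The helpers only build the two DEEP CLONE statements; in a notebook, each string would be passed to `spark.sql(...)` in the matching workspace.

```python
def delta_path(path: str) -> str:
    """Reference a Delta table by its storage path instead of a metastore name."""
    return f"delta.`{path}`"


def deep_clone_sql(target: str, source: str) -> str:
    """Build a DEEP CLONE statement; target and source are table names or delta.`path` references."""
    return f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}"


# Hypothetical staging location on storage that both QA and Prod can reach.
STAGING_PATH = "abfss://staging@sharedaccount.dfs.core.windows.net/big_table"

# Step 1, run in the QA workspace: copy the managed table out to the staging path.
qa_statement = deep_clone_sql(delta_path(STAGING_PATH), "qa_db.big_table")

# Step 2, run in the Prod workspace: clone from the staging path into a table
# created without a LOCATION clause, so the Prod metastore manages its files.
prod_statement = deep_clone_sql("prod_db.big_table", delta_path(STAGING_PATH))

print(qa_statement)
print(prod_statement)
```

With this layout the source table stays managed in QA, the target ends up managed in Prod, and only the intermediate copy on the shared storage is path-based.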
01-23-2023 05:24 AM
I would use Azure Data Factory to copy the 4 TB of files, as it has very high throughput. After everything has been copied, I would register the files as a table in the new metastore.
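The registration step could look roughly like the sketch below. It assumes the copy placed the whole table directory, including its `_delta_log` folder, at a path the Prod cluster can read; the path and table names are hypothetical. The helpers only build the SQL; in a notebook each string would be passed to `spark.sql(...)`.

```python
def register_delta_table_sql(table_name: str, location: str) -> str:
    """Build a statement that registers existing Delta files as a table at that location."""
    return f"CREATE TABLE IF NOT EXISTS {table_name} USING DELTA LOCATION '{location}'"


def deep_clone_sql(target: str, source: str) -> str:
    """Build a DEEP CLONE statement from one table into another."""
    return f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}"


# Hypothetical path where the copied files landed in the Prod subscription.
COPIED_PATH = "abfss://data@prodaccount.dfs.core.windows.net/big_table"

# Registering with LOCATION yields an external table that points at the copied files.
register_statement = register_delta_table_sql("prod_db.big_table_ext", COPIED_PATH)

# Optional: if a managed table is required, deep clone the external table into one.
managed_statement = deep_clone_sql("prod_db.big_table", "prod_db.big_table_ext")

print(register_statement)
print(managed_statement)
```

A table registered with `LOCATION` is external, so the second statement is only needed when the target must be managed, at the cost of a second copy of the data inside Prod.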
01-24-2023 12:33 PM
I thought about using ADF. Since we are using a managed Delta table, I am not sure how you can register a table based on external data. Any ideas?