Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Copy CDF enabled delta table from one location to another by retaining history

RKNutalapati
Valued Contributor

I am currently doing some use case testing. I have to clone a Delta table with CDF (Change Data Feed) enabled to a different S3 bucket. Deep clone doesn't meet the requirement, so I tried copying the files with dbutils.fs.cp. It copies all the versions, but the timestamps change.

Is there any workaround to retain the original timestamps while copying or migrating a Delta table?

Appreciate your valuable suggestions
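
Editor's sketch of why a plain file copy changes the timestamps. As far as I know, unless in-commit timestamps are enabled, Delta takes each version's timestamp from the modification time of the matching commit file in `_delta_log`, so a copy that creates new files also creates new timestamps. The example below is only a local-filesystem analogy: the file name is a stand-in for a commit file, `copy_and_compare` is a hypothetical helper, and the option of preserving the modification time shown here is, to my knowledge, not available on S3, where Last-Modified is set by the service.

```python
import os
import shutil
import tempfile
import time


def copy_and_compare(preserve_mtime):
    """Copy a stand-in commit file and return (source_mtime, copy_mtime)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a Delta commit file; the content does not matter here.
        src = os.path.join(tmp, "00000000000000000000.json")
        dst = os.path.join(tmp, "copy.json")
        with open(src, "w") as f:
            f.write("{}")

        # Pretend the commit was written one day ago.
        one_day_ago = time.time() - 86400
        os.utime(src, (one_day_ago, one_day_ago))

        if preserve_mtime:
            shutil.copy2(src, dst)     # copies content plus metadata (mtime)
        else:
            shutil.copyfile(src, dst)  # copies content only; mtime becomes "now"

        return os.path.getmtime(src), os.path.getmtime(dst)


src_mtime, dst_mtime = copy_and_compare(preserve_mtime=False)
print("content-only copy, mtime shift in seconds:", round(dst_mtime - src_mtime))

src_mtime, dst_mtime = copy_and_compare(preserve_mtime=True)
print("metadata-preserving copy, mtime shift in seconds:", round(dst_mtime - src_mtime))
```

The content-only copy behaves like the dbutils.fs.cp result described above: the data is identical, but anything that reads the file modification time sees the copy time instead of the original commit time.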

1 ACCEPTED SOLUTION

Accepted Solutions

@Rama Krishna N​ , According to the docs, "A cloned table has an independent history from its source table." Are you trying to use time travel? The clone use cases described in the docs may help explain this better: https://docs.databricks.com/delta/delta-utility.html#clone-use-cases


7 REPLIES

Kaniz_Fatma
Community Manager

Hi @Rama Krishna N​ ! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will get back to you soon. Thanks.

Kaniz_Fatma
Community Manager

Hi @Rama Krishna N​ ,

Because you're on Azure, you can use Azure Data Factory's Copy Data tool, as described in the documentation. Delta tables are just files in the container, so this tool can copy them, and it could be cheaper than using a Databricks cluster to do the copying.

Hi @Kaniz Fatma​ , thanks for your response. I am not using Azure; I am working with Databricks on AWS.

Thank you for your response, @Rama Krishna N​ . I understand your point better now.

Hi @Rama Krishna N​ ,

Just a friendly follow-up. Do you still need help? If you do, could you provide more details on your issue?

What type of cloning did you do? Shallow or deep clone?

A cloned table has an independent history from its source table. Time travel queries on a cloned table will not work with the same inputs as they work on its source table. You can find more information here https://docs.databricks.com/delta/delta-utility.html#clone-delta-table
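
Editor's sketch of one way to see the independent history for yourself, assuming a Databricks notebook where `spark` is already defined. The table names in the usage note are hypothetical, and `history_sql`, `version_timestamps`, and `diff_histories` are illustrative helpers, not part of any Databricks API. The only Delta feature relied on is `DESCRIBE HISTORY`, which returns `version` and `timestamp` columns among others.

```python
def history_sql(table):
    """Build the DESCRIBE HISTORY statement for a table name."""
    return f"DESCRIBE HISTORY {table}"


def version_timestamps(spark, table):
    """Return {version: commit timestamp} for a Delta table.

    `spark` is the SparkSession that Databricks notebooks provide.
    """
    rows = spark.sql(history_sql(table)).select("version", "timestamp").collect()
    return {row["version"]: row["timestamp"] for row in rows}


def diff_histories(source_hist, clone_hist):
    """List versions that are missing on one side or have different timestamps."""
    versions = sorted(set(source_hist) | set(clone_hist))
    return [v for v in versions if source_hist.get(v) != clone_hist.get(v)]
```

In a notebook this could be used as `diff_histories(version_timestamps(spark, "src_db.events"), version_timestamps(spark, "dst_db.events_clone"))`. Any version in the result is one where a time travel query written against the source table would not behave the same way on the clone.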

Hi @Jose Gonzalez​ : Thanks for the follow-up. I am using deep clone and was expecting the utility to maintain the same historical timestamps, but that is not happening. Is there a good reason for the independent history?

@Rama Krishna N​ , According to the docs, "A cloned table has an independent history from its source table." Are you trying to use time travel? The clone use cases described in the docs may help explain this better: https://docs.databricks.com/delta/delta-utility.html#clone-use-cases
