
How to back up Databricks Delta tables or a workspace

sanjay
Valued Contributor

Hi,

I am trying to understand how to back up Databricks Delta tables and the workspace, and how to restore them in case of a failure.

Alternatively, please suggest a way to revert if the data gets corrupted.

Regards,

Sanjay

4 REPLIES

pvignesh92
Honored Contributor

Hi @Sanjay Jain,

A Delta table is ACID compliant and keeps previous versions of your data for as long as the retention period you set allows. So if you accidentally overwrite the table with bad data, or delete data by mistake, you can use Delta Lake's time travel capabilities to go back to a previous version, as far back (in days) as your retention settings permit.

Delta Lake time travel: please refer to this link for some examples.
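To make the time travel idea a bit more concrete, here is a minimal sketch that only builds the SQL text for previewing and restoring an older version. The table name `sales.orders` and version `12` are made up for illustration, and the helper functions are not part of any Databricks API; in a notebook you would pass each string to `spark.sql(...)` yourself.

```python
from typing import Optional


def time_travel_query(table: str, version: int) -> str:
    """Build a read-only query against an older snapshot of a Delta table."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"


def restore_statement(
    table: str,
    version: Optional[int] = None,
    timestamp: Optional[str] = None,
) -> str:
    """Build a RESTORE statement for exactly one of version or timestamp."""
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return f"RESTORE TABLE {table} TO VERSION AS OF {version}"
    return f"RESTORE TABLE {table} TO TIMESTAMP AS OF '{timestamp}'"


# Hypothetical table name and version, for illustration only.
preview_sql = time_travel_query("sales.orders", 12)
restore_sql = restore_statement("sales.orders", version=12)

print(preview_sql)
print(restore_sql)
# In a Databricks notebook (where `spark` is predefined) these could be run as:
#   spark.sql(preview_sql).show()
#   spark.sql(restore_sql)
```

Previewing the old version before restoring is a cheap way to confirm it really is the data you want back. Note that a version can only be restored while its data files are still within the retention period.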

The workspace also keeps previous versions of your notebooks in the UI itself. In the Databricks UI, you can toggle the version history. Please refer to this: https://docs.databricks.com/workspace/index.html#

Kindly let me know if this helps.

sanjay
Valued Contributor

Thank you. Delta Lake Time Travel is helpful. Do you mean "Data Explorer", where I can see the history for each table? That way I would need to restore each table separately. Is there a way to restore the full workspace in one go in case of a failure?

pvignesh92
Honored Contributor

@Sanjay Jain In Data Explorer, you can only see object-level information. If you need the version history of a Delta Lake table, the command below will help:

DESCRIBE HISTORY table_name
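As a rough sketch of how that history could be used to roll several tables back in one go: pick, for each table, the last version committed at or before a cutoff time, and build one RESTORE statement per table. The history rows below are hand-written sample data (only the `version` and `timestamp` columns of the `DESCRIBE HISTORY` output are used), and the table names and helper functions are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional


def last_version_before(history: List[dict], cutoff: datetime) -> Optional[int]:
    """Return the highest version committed at or before the cutoff, if any."""
    candidates = [row["version"] for row in history if row["timestamp"] <= cutoff]
    return max(candidates) if candidates else None


def restore_plan(histories: Dict[str, List[dict]], cutoff: datetime) -> List[str]:
    """Build one RESTORE statement per table that has a version before the cutoff."""
    statements = []
    for table, history in sorted(histories.items()):
        version = last_version_before(history, cutoff)
        if version is not None:
            statements.append(f"RESTORE TABLE {table} TO VERSION AS OF {version}")
    return statements


# Sample data standing in for DESCRIBE HISTORY output (assumption, not real logs).
# In a notebook the rows might come from:
#   spark.sql(f"DESCRIBE HISTORY {table}").collect()
sample_histories = {
    "sales.orders": [
        {"version": 0, "timestamp": datetime(2023, 4, 1, 8, 0)},
        {"version": 1, "timestamp": datetime(2023, 4, 1, 20, 0)},
        {"version": 2, "timestamp": datetime(2023, 4, 2, 9, 0)},  # the bad write
    ],
    "sales.customers": [
        {"version": 0, "timestamp": datetime(2023, 3, 30, 12, 0)},
        {"version": 1, "timestamp": datetime(2023, 4, 3, 12, 0)},
    ],
}

plan = restore_plan(sample_histories, cutoff=datetime(2023, 4, 2, 0, 0))
for statement in plan:
    print(statement)
```

This still restores table by table under the hood, but a single loop covers every table you list. Tables with no version before the cutoff are skipped rather than guessed at.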

If you need notebook-level versioning, you can find it in the UI itself. It lets you see the changes in each version of your notebook and revert to any previous version. But if your requirement is at the level of the complete workspace, then a version control system such as Git (hosted on a service like GitHub or Bitbucket) should help.

Please check the links below for some more ideas on this.

Azure Databricks & Version Management | by Prosenjit Chakraborty | Medium

Version Control in Databricks Notebook (bigdataprogrammers.com)

By the way, @Sanjay Jain, you selected your own response as the best answer. I hope that was a mistake. If you feel my answer helped, please select mine as the best answer.

NandiniN
Valued Contributor II

Hi @Sanjay Jain,

Here are some of the ways to do this:

Deep Clone: https://www.databricks.com/wp-content/uploads/notebooks/using-deep-clone-disaster-recovery-delta-lak...

Repos for notebooks and code: https://docs.databricks.com/repos/index.html

https://docs.databricks.com/administration-guide/disaster-recovery.html this covers further details on DR.
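For the deep clone option, a minimal sketch of generating backup statements for a list of tables could look like this. The table names and the `backup` schema are placeholders, and the helper is not a Databricks API; it only produces SQL text that you would run yourself, for example with `spark.sql(...)` in a scheduled notebook.

```python
from typing import List


def deep_clone_statements(tables: List[str], backup_schema: str) -> List[str]:
    """Build one CREATE OR REPLACE ... DEEP CLONE statement per source table."""
    statements = []
    for table in tables:
        # Reuse the last part of the source name as the backup table name.
        short_name = table.split(".")[-1]
        statements.append(
            f"CREATE OR REPLACE TABLE {backup_schema}.{short_name} DEEP CLONE {table}"
        )
    return statements


# Hypothetical source tables and backup schema, for illustration only.
backup_sql = deep_clone_statements(["sales.orders", "sales.customers"], "backup")
for statement in backup_sql:
    print(statement)
```

Because only the last part of each name is reused, two source tables with the same name in different schemas would collide in the backup schema; adjust the naming if that applies to you.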

Thanks & Regards,

Nandini
