Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to back up Databricks Delta tables or workspace

sanjay
Valued Contributor II

Hi,

I am trying to understand how to back up Databricks Delta tables and the workspace, and how to restore them in case of a failure.

Alternatively, please suggest another way to revert if data gets corrupted.

Regards,

Sanjay


4 REPLIES

pvignesh92
Honored Contributor

Hi @Sanjay Jain,

Delta tables are ACID compliant and keep previous versions of your data for as long as the retention period you set. So if you accidentally overwrite a table with bad data, or delete data by mistake, you can use the time travel capabilities of Delta Lake to go back to a previous version, as far back (in days) as your retention settings allow.
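Since how far back you can go depends on retention, here is a minimal sketch of setting it per table. It only builds the SQL text; the helper name `retention_statement` and the table name are made up for illustration. The two table properties are the Delta ones that control how long the transaction log and removed data files are kept.

```python
def retention_statement(table: str, days: int) -> str:
    """Build an ALTER TABLE that asks Delta to keep `days` days of history."""
    if days < 1:
        raise ValueError("days must be at least 1")
    interval = f"interval {days} days"
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = '{interval}', "
        f"'delta.deletedFileRetentionDuration' = '{interval}')"
    )


# Hypothetical table name, for illustration only.
print(retention_statement("sales.orders", 30))

# In a Databricks notebook, where `spark` is predefined, you could run:
#   spark.sql(retention_statement("sales.orders", 30))
```

Longer retention means more old files are kept in storage, so pick a window that matches how late you might notice a problem.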

Delta Lake Time travel -> Please refer to this link for some examples of time travel.
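As a rough sketch of what time travel looks like in practice: the helpers below build a query that reads an older snapshot and a `RESTORE` statement that rolls the table back to it. The function names, the table name, and the version number are assumptions for illustration; take a real version number from the table's history.

```python
from typing import Optional


def time_travel_query(table: str, version: Optional[int] = None,
                      timestamp: Optional[str] = None) -> str:
    """Build a SELECT that reads an older snapshot of a Delta table."""
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return f"SELECT * FROM {table} VERSION AS OF {int(version)}"
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


def restore_statement(table: str, version: int) -> str:
    """Build a RESTORE that rolls the table back to the given version."""
    return f"RESTORE TABLE {table} TO VERSION AS OF {int(version)}"


# Hypothetical table name and version number, for illustration only.
print(time_travel_query("sales.orders", version=12))
print(restore_statement("sales.orders", 12))

# In a Databricks notebook, where `spark` is predefined, you could run:
#   spark.sql(time_travel_query("sales.orders", version=12)).show()
#   spark.sql(restore_statement("sales.orders", 12))
```

Reading the old snapshot first is a cheap way to confirm it really is the good version before you restore.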

The workspace also stores previous versions of your notebooks in the UI itself. In the Databricks UI, you can open the version history. Please refer to this: https://docs.databricks.com/workspace/index.html#

Kindly let me know if this helps.

sanjay
Valued Contributor II

Thank you. Delta Lake Time Travel is helpful. Do you mean "Data Explorer", where I can see the history for each table? That way I would need to retrieve each table separately. Is there a way to retrieve the full workspace in one go in case of a failure?

pvignesh92
Honored Contributor

@Sanjay Jain In Data Explorer, you can only see object-level information. If you need the version history of each Delta Lake table, the command below will help:

DESCRIBE HISTORY table_name
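If you don't want to run that table by table, a small loop can do it for you. This is only a sketch: `collect_history` and the `run_sql` callback are names I made up, and the callback stands in for whatever actually executes SQL in your environment.

```python
from typing import Callable, Dict, Iterable, List


def collect_history(tables: Iterable[str],
                    run_sql: Callable[[str], List[dict]]) -> Dict[str, List[dict]]:
    """Run DESCRIBE HISTORY for every table and return the rows keyed by table name."""
    return {table: run_sql(f"DESCRIBE HISTORY {table}") for table in tables}


# In a Databricks notebook, where `spark` is predefined, the inputs could be
# built like this (the schema name "sales" is hypothetical):
#   tables = [f"{t.database}.{t.name}" for t in spark.catalog.listTables("sales")]
#   run_sql = lambda query: [row.asDict() for row in spark.sql(query).collect()]
#   history = collect_history(tables, run_sql)
```

Note that `DESCRIBE HISTORY` fails on tables that are not Delta tables, so you may need to filter the list first.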

If you need notebook-level versioning, you can find it in the UI itself. It lets you see the changes in each version of your notebook and revert to any previous version. But if your requirement is at the level of the complete workspace, then a version control tool such as Git (hosted on, for example, Bitbucket) should help.

Please check the links below for more ideas on this.

Azure Databricks & Version Management | by Prosenjit Chakraborty | Medium

Version Control in Databricks Notebook (bigdataprogrammers.com)

btw @Sanjay Jain, you selected your own response as the best answer. I hope that was a mistake. If you feel my answer helped, please select mine as the best answer.

NandiniN
Databricks Employee

Hi @Sanjay Jain,

Here are some of the ways to do this:

Deep Clone: https://www.databricks.com/wp-content/uploads/notebooks/using-deep-clone-disaster-recovery-delta-lak...

Repos for notebooks and code: https://docs.databricks.com/repos/index.html

Disaster recovery: https://docs.databricks.com/administration-guide/disaster-recovery.html (this covers further details on DR).
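To give an idea of the Deep Clone option for many tables at once, here is a minimal sketch that builds one `DEEP CLONE` statement per table. The helper name, the table names, and the `backup` schema are assumptions for illustration; the backup schema has to exist already, ideally in a different storage location or region.

```python
from typing import Iterable, List


def deep_clone_statements(tables: Iterable[str], backup_schema: str) -> List[str]:
    """Build one DEEP CLONE statement per source table, targeting backup_schema."""
    statements = []
    for source in tables:
        # Keep only the table part of a dotted name such as "sales.orders".
        short_name = source.split(".")[-1]
        statements.append(
            f"CREATE OR REPLACE TABLE {backup_schema}.{short_name} DEEP CLONE {source}"
        )
    return statements


# Hypothetical table and schema names, for illustration only.
for statement in deep_clone_statements(["sales.orders", "sales.customers"], "backup"):
    print(statement)

# In a Databricks notebook, where `spark` is predefined, you could run each one:
#   spark.sql(statement)
```

Because the target name uses only the last part of the source name, two source tables with the same name in different schemas would collide; adjust the naming if that applies to you.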

Thanks & Regards,

Nandini
