03-20-2023 10:56 PM
Hi,
I am trying to understand how to back up Databricks Delta tables and the workspace, and how to restore them in case of a failure.
Alternatively, please suggest any other way to revert if data gets corrupted.
Regards,
Sanjay
03-21-2023 01:13 AM
Hi @Sanjay Jain,
A Delta table is ACID compliant and keeps previous versions of your data for as long as the retention period you set. So if you overwrite the table with bad data, or delete data from it by mistake, you can use Delta Lake time travel to go back to an earlier version, as far back as your retention settings allow.
Delta Lake time travel -> Please refer to this link for some examples of time travel.
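As a rough illustration of the time travel steps above, here is a minimal Python sketch that only builds the Delta SQL statements as strings. The table name `sales.orders`, the version number 5, and the 30-day retention are made-up examples, and the helper names are hypothetical. In a Databricks notebook you would pass each string to `spark.sql(...)`.

```python
def time_travel_query(table: str, version: int) -> str:
    """Read-only query against an earlier version of a Delta table."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"


def restore_to_version(table: str, version: int) -> str:
    """Statement that rolls the table itself back to an earlier version."""
    return f"RESTORE TABLE {table} TO VERSION AS OF {version}"


def set_retention(table: str, days: int) -> str:
    """Statement that sets how long table history and old data files are kept."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = 'interval {days} days', "
        f"'delta.deletedFileRetentionDuration' = 'interval {days} days')"
    )


if __name__ == "__main__":
    # Hypothetical table and version, for illustration only.
    for stmt in (
        time_travel_query("sales.orders", 5),
        restore_to_version("sales.orders", 5),
        set_retention("sales.orders", 30),
    ):
        print(stmt)  # in a Databricks notebook: spark.sql(stmt)
```

It is usually safer to inspect the old version with the SELECT first and run the RESTORE only once you are sure which version you want.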
The workspace also keeps previous versions of notebooks in the UI itself. In the Databricks UI, you can open the version history. Please refer to https://docs.databricks.com/workspace/index.html#
Kindly let me know if this helps.
03-21-2023 04:02 AM
Thank you. Delta Lake Time Travel is helpful. Do you mean "Data Explorer", where I can see the history for each table? That way I would need to restore each table separately. Is there a way to restore the full workspace in one go in case of a failure?
03-21-2023 05:06 AM
@Sanjay Jain In Data Explorer, you can only see object-level information. If you need the version history of a Delta table, the command below will help:
DESCRIBE HISTORY table_name
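To avoid handling each table by hand, the same command can be generated for a list of tables, together with a RESTORE to a common point in time. This is a minimal Python sketch, not an official procedure: the table names and the timestamp are made-up examples, the helper names are hypothetical, and the statements are only built as strings. In a notebook each one would be passed to `spark.sql(...)`.

```python
from datetime import datetime
from typing import Iterable, List


def history_statements(tables: Iterable[str]) -> List[str]:
    """One DESCRIBE HISTORY statement per table."""
    return [f"DESCRIBE HISTORY {t}" for t in tables]


def restore_statements(tables: Iterable[str], as_of: datetime) -> List[str]:
    """One RESTORE statement per table, all targeting the same timestamp."""
    ts = as_of.strftime("%Y-%m-%d %H:%M:%S")
    return [f"RESTORE TABLE {t} TO TIMESTAMP AS OF '{ts}'" for t in tables]


if __name__ == "__main__":
    # Hypothetical table list and restore point, for illustration only.
    tables = ["sales.orders", "sales.customers"]
    restore_point = datetime(2023, 3, 20, 22, 0, 0)
    for stmt in history_statements(tables) + restore_statements(tables, restore_point):
        print(stmt)  # in a Databricks notebook: spark.sql(stmt)
```

A table whose retained history does not cover the chosen timestamp may fail to restore, so it is worth checking the DESCRIBE HISTORY output first.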
If you need notebook-level versioning, you can look in the UI itself. It lets you see the changes in each version of a notebook and revert to any previous version. But if your requirement is at the level of the complete workspace, then version control tools like Git or Bitbucket should help.
Please check the links below for more on this.
Azure Databricks & Version Management | by Prosenjit Chakraborty | Medium
Version Control in Databricks Notebook (bigdataprogrammers.com)
By the way, @Sanjay Jain, you selected your own response as the best answer. I hope that was a mistake. If you feel my answer helped, please select mine as the best answer.
03-21-2023 04:13 AM
Hi @Sanjay Jain,
Here are some of the ways:
Deep Clone: https://www.databricks.com/wp-content/uploads/notebooks/using-deep-clone-disaster-recovery-delta-lak...
Repos for notebooks and code: https://docs.databricks.com/repos/index.html
https://docs.databricks.com/administration-guide/disaster-recovery.html this covers further details on DR.
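For the Deep Clone option, a backup job could look roughly like the Python sketch below. It only builds the `DEEP CLONE` statements as strings. The schema name `dr_backup`, the table names, and the naming rule for backup tables are assumptions made for illustration, and the helper names are hypothetical. In a notebook each statement would be passed to `spark.sql(...)`.

```python
from typing import Iterable, List


def backup_name(table: str, backup_schema: str) -> str:
    """Hypothetical naming rule: sales.orders -> <backup_schema>.sales_orders."""
    return f"{backup_schema}.{table.replace('.', '_')}"


def deep_clone_statements(tables: Iterable[str], backup_schema: str) -> List[str]:
    """One DEEP CLONE statement per source table."""
    return [
        f"CREATE OR REPLACE TABLE {backup_name(t, backup_schema)} DEEP CLONE {t}"
        for t in tables
    ]


if __name__ == "__main__":
    # Hypothetical source tables and backup schema, for illustration only.
    for stmt in deep_clone_statements(["sales.orders", "sales.customers"], "dr_backup"):
        print(stmt)  # in a Databricks notebook: spark.sql(stmt)
```

A deep clone copies the data files as well as the metadata, so the backup does not depend on the source table's files. Scheduling the statements as a job gives a regular table-level backup, while notebooks and code are covered separately by Repos.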
Thanks & Regards,
Nandini