03-20-2023 10:56 PM
Hi,
I am trying to understand how to back up Databricks Delta tables and the workspace, and how to restore them in case of a failure.
Alternatively, please suggest a way to revert if the data gets corrupted.
Regards,
Sanjay
03-21-2023 01:13 AM
Hi @Sanjay Jain ,
Delta tables are ACID compliant and keep previous versions of your data for the retention period you set. So if you accidentally overwrite a table with bad data, or delete data from it by mistake, you can use Delta Lake time travel to go back to an earlier version, as long as that version still falls within your retention period.
Delta Lake Time Travel -> Please refer to this link for some examples of time travel.
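To make the time travel idea concrete, here is a minimal sketch in Python that builds the two SQL statements usually involved: one to read an older version and one to roll the table back to it. The table name `sales.orders` and the version number are made up for illustration, and the commented `spark.sql` calls assume a Databricks notebook where `spark` is already defined.

```python
def time_travel_query(table: str, version: int) -> str:
    """Build a query that reads a Delta table as it was at a given version."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"


def restore_statement(table: str, version: int) -> str:
    """Build a statement that rolls a Delta table back to a given version."""
    return f"RESTORE TABLE {table} TO VERSION AS OF {version}"


# Hypothetical table name and version, for illustration only.
print(time_travel_query("sales.orders", 5))
print(restore_statement("sales.orders", 5))

# In a Databricks notebook (assumption: `spark` is the predefined session):
#   spark.sql(time_travel_query("sales.orders", 5)).show()
#   spark.sql(restore_statement("sales.orders", 5))
```

Both statements only work while the data files of the older version still exist, so a version that has already been cleaned up by VACUUM cannot be read or restored this way.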
The workspace also keeps previous versions of your notebooks, and you can view the version history in the Databricks UI itself. Please refer to https://docs.databricks.com/workspace/index.html#
Kindly let me know if this helps.
03-21-2023 04:02 AM
Thank you. Delta Lake Time Travel is helpful. Do you mean "Data Explorer", where I can see the history for each table? That way I would need to retrieve each table separately. Is there a way to retrieve the full workspace in one go in case of a failure?
03-21-2023 05:06 AM
@Sanjay Jain In the Data Explorer, you can only see object-level information. If you need the version history of a Delta Lake table, the command below will help:
DESCRIBE HISTORY table_name
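If running this table by table is the concern, one possible approach is to loop over the tables in a schema. The sketch below is only an illustration: the helper names are made up, `collect_history` assumes a Databricks notebook that passes in its predefined `spark` session, and it assumes every table in the schema is a Delta table (DESCRIBE HISTORY fails on views and non-Delta tables).

```python
from typing import Iterable, List


def history_statements(tables: Iterable[str]) -> List[str]:
    """Build one DESCRIBE HISTORY statement per fully qualified table name."""
    return [f"DESCRIBE HISTORY {table}" for table in tables]


def collect_history(spark, schema: str) -> dict:
    """Return {table_name: history DataFrame} for every table in `schema`.

    Assumption: `spark` is an active SparkSession and all listed tables
    are Delta tables.
    """
    rows = spark.sql(f"SHOW TABLES IN {schema}").collect()
    names = [f"{schema}.{row.tableName}" for row in rows]
    return {
        name: spark.sql(statement)
        for name, statement in zip(names, history_statements(names))
    }


# Hypothetical table names, for illustration only.
for statement in history_statements(["sales.orders", "sales.customers"]):
    print(statement)
```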
To see notebook-level versioning, you can look in the UI itself: the version history shows the changes in each version of your notebook and lets you revert to any previous version. But if your requirement is at the complete workspace level, then a version control system such as Git (hosted on, for example, GitHub or Bitbucket) should help.
Please check the links below for more ideas on this.
Azure Databricks & Version Management | by Prosenjit Chakraborty | Medium
Version Control in Databricks Notebook (bigdataprogrammers.com)
btw @Sanjay Jain you selected your response as the best answer. Hope it's a mistake. If you feel my answer helped, please select mine as best answer.
03-21-2023 04:13 AM
Hi @Sanjay Jain ,
Here are some of the ways to do this:
Deep Clone: https://www.databricks.com/wp-content/uploads/notebooks/using-deep-clone-disaster-recovery-delta-lak...
Repos for notebooks and code: https://docs.databricks.com/repos/index.html
https://docs.databricks.com/administration-guide/disaster-recovery.html this covers further details on DR.
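As a rough illustration of the Deep Clone option, the sketch below builds one clone statement per table, copying each into a separate backup schema. The schema and table names are hypothetical, the helper names are made up, and the commented `spark.sql` loop assumes a Databricks notebook where `spark` is predefined.

```python
from typing import Iterable, List


def deep_clone_statement(source: str, target: str) -> str:
    """Build a statement that copies a Delta table's data and metadata to `target`."""
    return f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}"


def backup_statements(tables: Iterable[str], backup_schema: str) -> List[str]:
    """Map each source table to a same-named table in `backup_schema`.

    Note: source tables that share a name across schemas would collide here.
    """
    return [
        deep_clone_statement(table, f"{backup_schema}.{table.split('.')[-1]}")
        for table in tables
    ]


# Hypothetical names, for illustration only.
for statement in backup_statements(["sales.orders", "sales.customers"], "sales_backup"):
    print(statement)

# In a Databricks notebook (assumption: `spark` is the predefined session):
#   for statement in backup_statements(tables, "sales_backup"):
#       spark.sql(statement)
```

Because a deep clone is an independent copy, it still holds the data if the source table is overwritten or corrupted; for disaster recovery the backup schema would normally sit in different storage or a different region than the source.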
Thanks & Regards,
Nandini