02-28-2022 09:56 PM
I have a large Delta table that I would like to back up, and I am wondering what the best practice is.
The goal is to be able to restore the data if there is any accidental corruption or data loss, whether at the Azure Blob Storage level or within Databricks itself.
Is the Azure Blob Storage "point-in-time restore" feature appropriate here? On paper, it sounds like it has all the features I require. However, what is the downstream effect of using it on a Delta table, and will a weekly OPTIMIZE cause rewrites of the data and blow out the costs?
In other Azure/Databricks documentation, there was mention of using Deep Clone for data replication.
Any thoughts appreciated.
- Labels: Azure, Backup, Blob, Deep Clone, Delta, Disaster recovery, Restore
Accepted Solutions
02-28-2022 11:41 PM
Deep Clone should do a good job of backing up Delta tables.
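Deep Clone copies both the table's data files and its transaction log to the target, so the backup is itself a queryable Delta table. A minimal sketch from a Databricks notebook; the table and storage path names are placeholders, not from this thread:

```python
# Minimal sketch, assuming a Databricks notebook with an existing Delta table.
# Table names and the abfss:// path below are placeholders.

# Back up to another table: DEEP CLONE copies data files and metadata.
spark.sql("""
    CREATE OR REPLACE TABLE main.backups.events_backup
    DEEP CLONE main.prod.events
""")

# Or clone straight to a storage path you control:
spark.sql("""
    CREATE OR REPLACE TABLE delta.`abfss://backup@mystorage.dfs.core.windows.net/events`
    DEEP CLONE main.prod.events
""")
```

Re-running the same CREATE OR REPLACE ... DEEP CLONE statement refreshes the backup incrementally, copying only files added since the last clone rather than the whole table.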
03-01-2022 12:25 AM
You can also set up a copy process in Azure Data Factory.
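An ADF copy pipeline is usually assembled in the portal rather than in code. As a rough programmatic sketch of the same idea (a server-side blob copy of the table directory) using the azure-storage-blob Python SDK instead of ADF; the connection string, container names, and prefix are all placeholders:

```python
# Rough sketch of a server-side blob copy, assuming the azure-storage-blob
# (v12) SDK. Connection string, containers, and path prefix are placeholders.
from azure.storage.blob import BlobServiceClient

svc = BlobServiceClient.from_connection_string("<connection-string>")
src = svc.get_container_client("data")
dst = svc.get_container_client("backup")

# Copy every blob under the table's directory, _delta_log included.
for blob in src.list_blobs(name_starts_with="tables/events/"):
    dst.get_blob_client(blob.name).start_copy_from_url(f"{src.url}/{blob.name}")
```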
03-01-2022 12:39 AM
A big advantage of file-based storage (compared to an RDBMS): copy/paste 🙂
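Since a Delta table is just a directory of files in storage, the simplest backup really is a recursive copy. A minimal sketch from a Databricks notebook; both paths are placeholders, and note that a plain file copy is only a consistent snapshot if nothing is writing to the table while it runs (Deep Clone avoids that caveat):

```python
# Minimal sketch, assuming a Databricks notebook; both paths are placeholders.
# Copies the whole table directory, including the _delta_log transaction log.
src = "abfss://data@mystorage.dfs.core.windows.net/tables/events"
dst = "abfss://backup@mystorage.dfs.core.windows.net/tables/events"
dbutils.fs.cp(src, dst, recurse=True)
```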
04-27-2022 09:33 AM
Hi @deisou
Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark the answer as best? If not, please tell us so we can help you.
Cheers!