Hi,
In my opinion, Databricks Deep Clone does not currently support cloning Unity Catalog tables natively across different metastores (each region having its own metastore). Deep Clone requires that both source and target belong to the same metastore context, so this approach won't work out of the box for your DR strategy across primary and secondary regions.
That said, here are a few alternative approaches you could consider for achieving your DR objective:
1. Delta Sharing between metastores
You could use Delta Sharing to expose the source tables from the primary region and then recreate or hydrate them in the secondary region. Delta Sharing supports cross-account and cross-region sharing, even across clouds.
However, it's worth noting that Delta Sharing is optimized for data access and interoperability, not necessarily for high-throughput replication, and performance can be a concern, especially for large or frequently changing tables.
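For illustration, here is a minimal sketch of the provider/recipient setup run from notebooks (the share, recipient, catalog, and table names are placeholders, and the recipient's sharing identifier would have to come from the secondary metastore):

```python
# Provider side (primary region's workspace): create a share and add a table.
# dr_share, dr_recipient and main.sales.orders are placeholder names.
spark.sql("CREATE SHARE IF NOT EXISTS dr_share")
spark.sql("ALTER SHARE dr_share ADD TABLE main.sales.orders")

# Databricks-to-Databricks sharing: the recipient is created from the secondary
# metastore's sharing identifier (format: cloud:region:metastore-uuid).
spark.sql(
    "CREATE RECIPIENT IF NOT EXISTS dr_recipient "
    "USING ID 'azure:<region>:<secondary-metastore-uuid>'"
)
spark.sql("GRANT SELECT ON SHARE dr_share TO RECIPIENT dr_recipient")

# Recipient side (secondary region's workspace): mount the share as a catalog.
# primary_provider is a placeholder for how the provider appears in the
# secondary metastore.
spark.sql("CREATE CATALOG IF NOT EXISTS dr_shared USING SHARE primary_provider.dr_share")
```

Keep in mind the shared catalog is read-only on the recipient side, so for DR you would typically materialize the data into local tables (e.g., CTAS or a scheduled MERGE) rather than rely on the share alone.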
2. File-level replication (e.g., AzCopy, Azure Data Factory)
Another robust approach is to replicate the underlying Delta Lake files using tools like AzCopy or Azure Data Factory, similar to what AWS DataSync provides.
This method works at the storage layer: once the data is in the target region's storage account, you can register the tables manually (or via automation) in the secondary Unity Catalog metastore. This essentially gives you a snapshot of the latest state of your tables.
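As a rough sketch of that flow (the storage account names, container, paths, and the dr_main catalog below are placeholders; AzCopy authentication via SAS tokens or azcopy login is assumed to be configured, and the secondary metastore is assumed to have an external location covering the target path):

```python
# Step 1 (run outside Databricks, e.g. from a VM or an ADF custom activity):
# copy the table's Delta files (data files + _delta_log) to the secondary region.
#
#   azcopy copy \
#     "https://primarystorage.dfs.core.windows.net/lake/sales/orders" \
#     "https://secondarystorage.dfs.core.windows.net/lake/sales/orders" \
#     --recursive
#
# Step 2 (run in a notebook in the secondary workspace): register the copied
# files as an external table in the secondary region's Unity Catalog metastore.
spark.sql("""
  CREATE TABLE IF NOT EXISTS dr_main.sales.orders
  USING DELTA
  LOCATION 'abfss://lake@secondarystorage.dfs.core.windows.net/sales/orders'
""")
```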
3. Snapshots + Restore
If you're using ADLS Gen2 with versioning or backup policies, you can take advantage of storage-level snapshots. In a DR event, you could restore those snapshots into a separate container or region and then rehydrate the tables in Databricks.
This method is slower in terms of RTO but can serve as a last-resort recovery strategy.
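If you go down this path, the rehydration step can be scripted; here is a minimal sketch, assuming the restored files land under a known root path and that you keep a list of tables to recover (both are assumptions for illustration):

```python
# Hypothetical inputs: where the restored snapshot was placed, and which tables
# to re-register in the secondary metastore. dr_main is a placeholder catalog.
restored_root = "abfss://restore@secondarystorage.dfs.core.windows.net/lake"
tables = ["sales.orders", "sales.customers", "finance.invoices"]

for fq_name in tables:
    schema, table = fq_name.split(".")
    spark.sql(f"CREATE SCHEMA IF NOT EXISTS dr_main.{schema}")
    spark.sql(f"""
      CREATE TABLE IF NOT EXISTS dr_main.{fq_name}
      USING DELTA
      LOCATION '{restored_root}/{schema}/{table}'
    """)
```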
Hope this helps,
Isi