Data Governance

For Unity Catalog can we sync 2 Metastores which are in different regions?

Jas
New Contributor II

In our Production tenant, the majority of our workspaces are in the North Europe region.

We have disaster recovery set up, with those workspaces in West Europe, so we will require 2 Metastores.

If Unity Catalog were implemented, is there a way to sync the 2 Metastores in different regions?

2 REPLIES

Kaniz
Community Manager

Hi @Jas, based on the provided information, you can synchronize the 2 Metastores in different regions using a script-based solution or a CI/CD workflow. This is part of the disaster recovery setup, where you must replicate the correct data in the control plane, data plane, and data sources. The redundant workspaces for disaster recovery must map to different control planes in other regions, and you must keep that data in sync periodically.

For data sources, it is recommended to use native replication and redundancy tools to copy data to the disaster recovery regions.

In terms of tooling, two main approaches can be used to keep data as similar as possible between workspaces in your primary and secondary regions:

1. Synchronization client that copies from primary to secondary: A sync client pushes production data and assets from the primary region to the secondary region. This typically runs on a scheduled basis.

2. CI/CD tooling for parallel deployment: For production code and assets, use CI/CD tooling that pushes changes to the production systems in both regions simultaneously. For example, when pushing code and assets from staging/development to production, a CI/CD system makes them available in both regions at the same time. You can compare the metadata definitions between the Metastores using the Spark Catalog API or `SHOW CREATE TABLE` via a notebook or scripts.

Note that the underlying storage for tables can be region-specific, so storage locations will differ between Metastore instances.
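
The metadata comparison mentioned above could be scripted roughly as follows. This is only a sketch under some assumptions: the helper names (`collect_ddl`, `normalize_ddl`, `diff_metastores`) are made up for illustration, `collect_ddl` is assumed to run once per workspace with an active `SparkSession`, and the two results are assumed to be exported (for example as JSON) so they can be compared in one place. Because storage paths differ per region, the `LOCATION` clause is masked before comparing.

```python
import re

# Matches a LOCATION '...' clause; storage paths are expected to differ per region.
_LOCATION_RE = re.compile(r"LOCATION\s+'[^']*'", re.IGNORECASE)


def normalize_ddl(ddl: str) -> str:
    """Mask the storage location and collapse whitespace so that only
    definition-level differences remain."""
    masked = _LOCATION_RE.sub("LOCATION '<masked>'", ddl)
    return " ".join(masked.split())


def collect_ddl(spark, table_names):
    """Return {table_name: DDL} via SHOW CREATE TABLE.

    `spark` is assumed to be the SparkSession of the workspace being
    inspected; `table_names` are fully qualified names (hypothetical input).
    """
    return {
        name: spark.sql(f"SHOW CREATE TABLE {name}").first()[0]
        for name in table_names
    }


def diff_metastores(primary: dict, secondary: dict) -> dict:
    """Compare two {table_name: DDL} mappings, ignoring storage locations."""
    common = set(primary) & set(secondary)
    return {
        "missing_in_secondary": sorted(set(primary) - set(secondary)),
        "missing_in_primary": sorted(set(secondary) - set(primary)),
        "definition_mismatch": sorted(
            name
            for name in common
            if normalize_ddl(primary[name]) != normalize_ddl(secondary[name])
        ),
    }
```

A scheduled job could run `diff_metastores` on the two exported mappings and alert when any of the three lists is non-empty. Grants, views, and other securables are not covered by this sketch and would need their own checks.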

Sources:
- [Docs: disaster-recovery](https://docs.databricks.com/administration-guide/disaster-recovery.html)

Jas
New Contributor II

Thank you Kaniz - I will relay this information to my team and reach out if we have any issues when implementing.
