Hi @Jas, yes, you can synchronize the two metastores in different regions using a script-based solution or a CI/CD workflow. This is part of a disaster recovery setup, where you must replicate the correct data across the control plane, data plane, and data sources. The redundant disaster recovery workspaces map to separate control planes in other regions, and you must keep that data in sync periodically.
For data sources, use each source's native replication and redundancy tools to copy data to the disaster recovery region.
In terms of tooling, two main approaches can be used to keep data as similar as possible between workspaces in your primary and secondary regions:
1. Synchronization client that copies from primary to secondary: A sync client pushes production data and assets from the primary region to the secondary region. This typically runs on a scheduled basis.
2. CI/CD tooling for parallel deployment: For production code and assets, use CI/CD tooling that pushes changes to production systems in both regions simultaneously. For example, when promoting code and assets from staging/development to production, the CI/CD system makes them available in both regions at the same time. To verify the regions stay in sync, you can compare metadata definitions between the metastores using the Spark Catalog API or `SHOW CREATE TABLE` from a notebook or script.
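For the synchronization-client approach, one option is a small script driven by the Databricks Workspace REST API, exporting notebook sources from the primary workspace and importing them into the secondary one. The sketch below is a minimal illustration, not a complete sync client: the `host` and `token` values are placeholders you would supply, and error handling, pagination, and non-notebook assets are omitted.

```python
import base64
import json
import urllib.request


def export_notebook(host, token, path):
    """Fetch a notebook's source from a workspace via the Workspace export API.

    Returns the raw source bytes (the API returns base64-encoded content).
    """
    url = f"{host}/api/2.0/workspace/export?path={path}&format=SOURCE"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return base64.b64decode(json.loads(resp.read())["content"])


def build_import_payload(path, source_bytes):
    """Build the JSON body for the Workspace import API.

    The API expects the notebook content base64-encoded; overwrite=True
    makes the sync idempotent on repeated scheduled runs.
    """
    return {
        "path": path,
        "format": "SOURCE",
        "language": "PYTHON",
        "overwrite": True,
        "content": base64.b64encode(source_bytes).decode("ascii"),
    }
```

A scheduled job in the primary region could loop over a list of production paths, call `export_notebook` against the primary workspace, and POST each `build_import_payload` result to the secondary workspace's `/api/2.0/workspace/import` endpoint.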
Note that the underlying storage locations of tables are typically region-specific, so those paths will differ between metastore instances.
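The metadata comparison mentioned above can be sketched as a notebook helper: collect `SHOW CREATE TABLE` output for a set of tables in each metastore, then diff the two maps. The `spark` session and the table list are assumed to come from your environment; because storage locations differ per region, a production version would normalize or strip `LOCATION` clauses before comparing.

```python
def fetch_table_ddls(spark, tables):
    """Collect CREATE TABLE statements for the given tables via SHOW CREATE TABLE.

    Returns a {table_name: ddl_string} map. Run once per metastore
    (e.g. from a notebook attached to each workspace).
    """
    ddls = {}
    for table in tables:
        row = spark.sql(f"SHOW CREATE TABLE {table}").collect()[0]
        ddls[table] = row[0]
    return ddls


def diff_metastores(primary_ddls, secondary_ddls):
    """Compare two {table_name: DDL} maps and report drift between metastores."""
    primary, secondary = set(primary_ddls), set(secondary_ddls)
    return {
        # Tables that exist only in one region.
        "missing_in_secondary": sorted(primary - secondary),
        "missing_in_primary": sorted(secondary - primary),
        # Tables present in both but with differing definitions.
        "definition_mismatch": sorted(
            t for t in primary & secondary
            if primary_ddls[t] != secondary_ddls[t]
        ),
    }
```

Running `diff_metastores` on the two maps gives a quick drift report you can alert on from a scheduled job.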
Sources:
- [Docs: disaster-recovery](https://docs.databricks.com/administration-guide/disaster-recovery.html)