Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Unity Catalog Migration Strategy

ck7007
New Contributor II

Zero-Downtime Unity Catalog Migration for 500TB Data Lake

Just completed migrating 500TB to Unity Catalog without a single minute of downtime. Here's how:

The Challenge

  • 500 TB across 12,000 tables
  • 200+ concurrent users
  • Zero tolerance for downtime
  • Mixed Hive and Delta tables

The Solution: Parallel Sync Strategy

Step 1: Shadow Catalog Setup
def create_shadow_catalog(source_db, target_catalog):
    """
    Creates a UC catalog that shadows the existing Hive metastore.
    """
    tables = spark.catalog.listTables(source_db)

    for table in tables:
        # Create an external table pointing at the same storage location
        spark.sql(f"""
            CREATE TABLE IF NOT EXISTS {target_catalog}.{source_db}.{table.name}
            USING DELTA
            LOCATION '{get_table_location(source_db, table.name)}'
        """)

        # Sync permissions
        sync_table_permissions(source_db, table.name, target_catalog)
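The helpers `get_table_location` and `sync_table_permissions` are assumed above and not shown. One subtlety worth sketching: Hive table names can contain characters (hyphens, dots) that need backtick quoting when you build the three-level UC name. A minimal, hypothetical helper pair for that:

```python
def quote_ident(name: str) -> str:
    """Backtick-quote an identifier for Spark SQL, escaping embedded backticks."""
    return "`" + name.replace("`", "``") + "`"

def uc_table_name(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified three-level Unity Catalog name, safely quoted."""
    return ".".join(quote_ident(p) for p in (catalog, schema, table))
```

This is just a sketch; your metastore may enforce stricter naming than this assumes.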


Step 2: Dual-Write Pattern
class DualWriter:
    """
    Writes to both Hive and UC during the transition.
    """
    def write_data(self, df, table_name):
        # Write to the original Hive table
        df.write.mode("append").saveAsTable(f"hive_metastore.{table_name}")

        # Simultaneously write to UC
        df.write.mode("append").saveAsTable(f"main.prod.{table_name}")

        # Verify consistency
        assert verify_row_counts(f"hive_metastore.{table_name}",
                                 f"main.prod.{table_name}")
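The `verify_row_counts` helper used in the assert isn't shown. A minimal sketch, parameterized on a count function so it can be unit-tested without a live cluster (names and the `tolerance` parameter are my assumptions, not the author's):

```python
def verify_row_counts(source: str, target: str, count_fn, tolerance: int = 0) -> bool:
    """Return True when the target table's row count is within
    `tolerance` rows of the source table's."""
    src, tgt = count_fn(source), count_fn(target)
    return abs(src - tgt) <= tolerance
```

On a cluster, `count_fn` could be `lambda t: spark.table(t).count()`; a small nonzero tolerance helps when both sides are being appended to concurrently.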

Step 3: Smart Query Router
import random

def route_query(query, user_group):
    """
    Gradually routes traffic to UC.
    """
    migration_percentage = get_migration_percentage(user_group)

    if random.random() < migration_percentage:
        # Route to Unity Catalog
        return query.replace("hive_metastore.", "main.prod.")
    else:
        # Keep on Hive
        return query
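`get_migration_percentage` is assumed above. One hypothetical way to back it: a hard-coded rollout schedule per user group, ramped up as confidence grows (group names and fractions here are illustrative, not from the post):

```python
# Hypothetical rollout schedule: fraction of each group's traffic routed to UC
ROLLOUT = {
    "readonly_analysts": 1.0,   # migrated first, per lesson 3 (lowest risk)
    "etl_jobs": 0.25,
    "admins": 0.0,
}

def get_migration_percentage(user_group: str) -> float:
    """Fraction of this group's queries to route to UC (unknown groups stay on Hive)."""
    return ROLLOUT.get(user_group, 0.0)
```

In practice you would likely read these fractions from a config table rather than a constant, so you can ramp without redeploying.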

Results

  • Migration time: 3 weeks (running in the background)
  • Downtime: ZERO
  • Failed queries: 0.01% (auto-retried)
  • Performance gain: 23% faster queries post-migration

Key Lessons

  1. Never use "DEEP CLONE" for large tables—too slow
  2. External tables are your friend—same data, different metadata
  3. Test with read-only users first—lower risk
  4. Monitor table drift—catches issues early
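For lesson 4, a drift monitor can start as a simple diff of the two catalogs' schemas. A sketch, operating on plain column-name-to-type dicts so it is easy to test (the function name and dict shape are my assumptions):

```python
def schema_drift(hive_schema: dict, uc_schema: dict) -> dict:
    """Return columns whose types differ (or that exist on only one side)
    between the Hive and UC copies of a table, as {col: (hive_type, uc_type)}."""
    drift = {}
    for col in set(hive_schema) | set(uc_schema):
        if hive_schema.get(col) != uc_schema.get(col):
            drift[col] = (hive_schema.get(col), uc_schema.get(col))
    return drift
```

Run on a schedule against every table pair and alert on any nonempty result; row-count drift can be checked the same way.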

Rollback Strategy (Saved Us Twice!)

# Instant rollback if issues are detected
spark.conf.set("spark.sql.defaultCatalog", "hive_metastore")
# All queries automatically revert to Hive


Anyone else doing UC migration? What patterns worked for you?

P.S. I created a full migration toolkit; DM if interested!

2 REPLIES

Khaja_Zaffer
Contributor

Hey @ck7007 


That's a good hands-on project! What kind of toolkit do you have, sir?

BS_THE_ANALYST
Esteemed Contributor

@ck7007 I'd love to check out the toolkit as well, and thanks for sharing this strategy.

I'm curious, is this a typical migration to Unity Catalog? ☺️.

Also, with it taking 3 weeks, there must be some chance of the data not being the "latest"? 🤔 Curious whether that's the benefit of taking downtime: that you can get all the latest data over?

All the best,
BS
