Legacy hive_metastore corruption

thbeh_com
New Contributor III

I have recently been seeing legacy hive_metastore corruption (especially tables created as Parquet instead of Delta) at a client site that is in the midst of migrating to UC. We were provided with Scala code to physically remove the erroneous Parquet files. Is anyone else facing a similar issue?

lingareddy_Alva
Esteemed Contributor

Hi @thbeh_com

Yes, this is a fairly common issue during UC migrations, especially with legacy Hive metastore tables. The corruption typically happens because:
- Metadata-data misalignment - Hive metastore references files that no longer exist or have been moved
- Parquet schema evolution issues - Column changes not properly reflected in metastore
- Concurrent operations during migration causing inconsistent states
- File system operations bypassing Hive metastore updates
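Before deleting anything physically, it may help to flag which tables actually look inconsistent. Below is a minimal sketch in plain Python, not a definitive check: it compares the provider the metastore reports for a table with a listing of the table's storage location. The function name `classify_table` and its return labels are hypothetical. It assumes you have already collected the provider (for example from the "Provider" row of `DESCRIBE TABLE EXTENDED`) and the file names under the table location (for example via `dbutils.fs.ls`).

```python
from typing import Iterable


def classify_table(provider: str, files: Iterable[str]) -> str:
    """Compare the registered provider with what the storage listing suggests.

    `provider` is the format recorded in the metastore ("parquet", "delta", ...).
    `files` are names relative to the table location (hypothetical input shape).
    """
    names = [f.strip("/") for f in files]
    # A Delta table keeps its transaction log in a `_delta_log` directory.
    has_delta_log = any(
        n == "_delta_log" or n.startswith("_delta_log/") for n in names
    )
    has_parquet = any(n.endswith(".parquet") for n in names)
    provider = provider.lower()

    if provider == "parquet" and has_delta_log:
        # Registered as Parquet, but the location holds a Delta log.
        return "format_mismatch"
    if provider == "delta" and not has_delta_log:
        # Registered as Delta, but no transaction log was found.
        return "missing_delta_log"
    if provider == "parquet" and not has_parquet:
        # Could be a legitimately empty table or files that were moved; review manually.
        return "no_data_files"
    return "ok"


print(classify_table("parquet", ["_delta_log/", "part-0000.parquet"]))
```

Anything other than "ok" is only a hint that the table deserves a manual look; it does not by itself prove corruption or justify deleting files.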

 

LR


Thanks @lingareddy_Alva. Your points very much reflect the current situation. 
