UC upgrade in Spark Streaming jobs
Thursday
Could you kindly share the recommended approach for upgrading from HMS to UC for structured streaming jobs, ensuring seamless execution without failures or data duplication? I would also appreciate insights into any best practices you have followed during similar upgrades.
- Labels:
  - Spark
Friday
Hi Vetrivel,
How are you doing today? As I understand it, upgrading from Hive Metastore (HMS) to Unity Catalog (UC) for structured streaming jobs needs a careful approach to avoid failures or data duplication. What has worked well is:
- Pause all streaming jobs first so nothing writes during the cutover.
- Migrate your tables to UC while keeping the table locations and checkpoint directories exactly as they are.
- Update your jobs to use the new three-level UC names (catalog.schema.table).
- Restart the jobs against the same checkpoints so they continue from where they left off, with no reprocessing or duplicates.
- Test the whole flow in dev or staging first, check for any issues, and only then move to production.
- Where you want to minimize code changes, views or table aliases over the UC tables can make the transition smoother.

Let me know if you need help setting this up or want a sample migration plan! Two minimal sketches of the table-upgrade and job-update steps follow below.
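A minimal sketch of the table-upgrade step, assuming the HMS tables are external tables whose data can stay in place. The catalog, schema, and table names are placeholders, and the exact SYNC syntax and availability depend on your workspace and runtime, so please verify against the current Databricks docs (managed HMS tables usually need a different path, such as CTAS or deep clone, which does move data and therefore affects checkpoints):

```python
from pyspark.sql import SparkSession

# In a Databricks notebook or job, `spark` is already provided.
spark = SparkSession.builder.getOrCreate()

# Register an existing external HMS table in Unity Catalog without
# moving any data files. Placeholder names; SYNC syntax here is an
# assumption -- check your Databricks Runtime docs before relying on it.
spark.sql("""
    SYNC TABLE main.sales.orders
    FROM hive_metastore.sales.orders
""")
# The table LOCATION and the underlying files are untouched; only the
# metastore registration changes, which is what keeps existing streaming
# checkpoints valid after the cutover.
```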
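And a minimal sketch of the job-update step, assuming a Delta-to-Delta stream; all table names and the checkpoint path are placeholders. The only change is switching to the three-level UC names, while the checkpoint location is left untouched so the query resumes from its recorded offsets:

```python
from pyspark.sql import SparkSession

# `spark` is provided automatically in Databricks notebooks and jobs.
spark = SparkSession.builder.getOrCreate()

# Before the upgrade the job read the HMS table, e.g.:
#   spark.readStream.table("sales.orders_raw")
# After the upgrade it reads the same table through Unity Catalog:
events = spark.readStream.table("main.sales.orders_raw")

query = (
    events.writeStream
    .option("checkpointLocation", "/mnt/checkpoints/orders_silver")  # same path as before
    .outputMode("append")
    .toTable("main.sales.orders_silver")  # three-level UC name
)
```

Because the checkpoint is reused, the stream picks up exactly the offsets it had already committed, which is what prevents both data loss and duplicates. If the upgrade had copied the data into a new table instead of registering it in place, the old checkpoint would no longer match the new source, and the query would either fail or need a fresh checkpoint (and a fresh checkpoint is where reprocessing and duplicate risk come from).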
Regards,
Brahma

