cancel
Showing results forĀ 
Search instead forĀ 
Did you mean:Ā 
Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
cancel
Showing results forĀ 
Search instead forĀ 
Did you mean:Ā 

How to Migrate Legacy Dashboards from hive_metastore to Unity Catalog using Python

Sudheer2
New Contributor III

Hi all,

After updating the legacy dashboard APIs, I’m looking to migrate legacy dashboards from the hive_metastore to Unity Catalog in Databricks. Specifically, I want to programmatically:

  • Migrate SQL queries used in dashboards

  • Retain or recreate the associated data visualizations

  • Ensure compatibility with Unity Catalog schemas/tables

  • Perform the migration using Python

Has anyone successfully done this, or is there any recommended approach or API support for this kind of end-to-end migration?

Appreciate any guidance, best practices, or script examples.

Thanks in advance!

1 REPLY 1

BigRoux
Databricks Employee
Databricks Employee

For your consideration:

 

To migrate legacy dashboards from the Hive Metastore to Unity Catalog in Databricks programmatically while retaining SQL queries, data visualizations, and ensuring compatibility with Unity Catalog schemas and tables using Python, here are some approaches and best practices based on the search results:
  1. Migration of SQL Queries:
    • You can migrate SQL queries by rewriting them to use Unity Catalog's three-level namespace (catalog.schema.table), which is required for Unity Catalog compatibility.
    • If your SQL queries use tables from the Hive Metastore, these tables should first be migrated to Unity Catalog using utilities like CREATE TABLE AS SELECT or CREATE TABLE CLONE (preferably deep clones for Delta tables, which preserve Delta logs).
  2. Migrating Data Visualizations:
    • Data visualizations tied to queries in the legacy dashboards can be recreated by cloning the dashboards. Databricks provides tools for cloning legacy dashboards to Lakeview dashboards, which offers expanded API support for creating, getting, and updating dashboards programmatically.
    • Queries and visualizations can be copied to create new dashboards from the SQL editor, ensuring minimal modification during migration.
  3. Compatibility with Unity Catalog Schemas/Tables:
    • Make sure the schema and table definitions are adjusted for Unity Catalog. Unity Catalog supports managed tables in Delta format as well as external tables.
    • Use the SYNC SQL command or Hive Metastore federation to migrate external tables or enable a hybrid workspace for incremental migration.
    • For external locations, ensure adequate setup of Unity Catalog-managed access controls and storage credentials.
  4. Python Automation:
    • The Databricks Python SDK can be utilized for orchestrating migration processes. You can identify and migrate tables programmatically by leveraging SDK utilities (databricks.sdk.service.catalog) or issuing SQL commands programmatically via Databricks compute resources.
    • Databricks recommends using deep clone operations for managed Delta tables or the CREATE TABLE AS SELECT approach for broader migration options.
    • Dashboards can be manipulated programmatically using Databricks SQL Queries and Dashboard APIs.
  5. Additional Considerations for Dashboard Migration:
    • If dropdown list parameters or query-backed parameters exist in legacy dashboards, these should be migrated to comply with Unity Catalog standards.
    • Legacy dashboards can be cloned to Lakeview dashboards programmatically, making it easier to move previously embedded SQL queries and visualizations.
  6. Recommended Tools and Resources:
    • Enable Hive Metastore federation to maintain access to Hive tables during active migration processes. Federation simplifies incremental migration by allowing query execution from both Hive Metastore and Unity Catalog during the transition.
    • Prioritize migration tools like the UCX toolkit for managing schema and table upgrades and handling both managed and external Unity Catalog tables.
Challenges or additional steps may apply depending on the specifics of your dashboards or infrastructure. For a seamless migration, carefully assess data dependencies, storage formats, and any legacy constructs that may require adaptation to comply with Unity Catalog standards. Always test the migrated dashboards and data pipelines in a staging environment before final deployment.

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now