CI/CD Databricks Asset Bundles - DLT pipelines - unity catalog and target schema

MarinD
New Contributor II

Is it possible for the CI/CD Databricks Asset Bundles YAML file to describe the Unity Catalog and target schema as the destination for the DLT pipeline? Or is that just not possible today?

If this functionality is not available today, are there any plans to make it available?

1 ACCEPTED SOLUTION


Kaniz
Community Manager

Hi @MarinD, as of now, Databricks Asset Bundles do not directly support specifying the Unity Catalog and target schema as the destination for a Delta Live Tables (DLT) pipeline within the YAML configuration file.

However, let's delve into the details:

  1. Databricks Asset Bundles and CI/CD:

    • Databricks Asset Bundles allow you to package and deploy Databricks assets (such as notebooks, libraries, and jobs) in a structured manner.
    • They are useful for automating and customizing CI/CD workflows in your GitHub repositories using GitHub Actions and the Databricks CLI.
    • You can define bundle configurations in YAML files to manage your assets (a minimal sketch follows this list).
  2. Unity Catalog Support for DLT Pipelines:

    • Unity Catalog is a powerful feature that allows you to define and manage tables, views, and materialized views within Databricks.
    • With Unity Catalog, you can:
      • Define a catalog where your pipeline will persist tables.
      • Read data from Unity Catalog tables.
      • Query tables created by pipelines using both Python and SQL interfaces (a short Python example follows this list).
      • Use shared Unity Catalog clusters with Databricks Runtime 13.1 and above or a SQL warehouse.
    • However, there are some limitations:
      • Existing pipelines that use the Hive metastore cannot be upgraded to use Unity Catalog.
      • A single pipeline cannot write to both the Hive metastore and Unity Catalog.
      • Existing pipelines not using Unity Catalog remain unaffected and continue to persist data to the Hive metastore.
  3. Specifying Catalog and Schema:

    • To create tables in Unity Catalog from a DLT pipeline, you need:
      • USE CATALOG privileges on the target catalog.
      • CREATE MATERIALIZED VIEW and USE SCHEMA privileges in the target schema (if your pipeline creates materialized views).
      • CREATE TABLE and USE SCHEMA privileges (if your pipeline creates streaming tables).
      • If no target schema is specified in the pipeline settings, you need privileges on at least one schema in the target catalog (example GRANT statements follow this list).
  4. Future Plans:

    • While there are no specific details about plans to directly specify Unity Catalog and target schema in the YAML file for DLT pipelines, Databricks continually enhances its features.
    • It's possible that future updates may address this limitation, but as of now, it's not natively supported.
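
As a point of reference for item 1, the sketch below shows the general shape of a bundle's databricks.yml. It is a minimal, hypothetical example: the bundle name, workspace URL, job name, and notebook path are placeholders, and cluster settings are omitted for brevity.

    # databricks.yml -- minimal bundle sketch with placeholder names
    bundle:
      name: my_example_bundle            # hypothetical bundle name

    targets:
      dev:
        mode: development
        workspace:
          host: https://<your-workspace>.cloud.databricks.com   # replace with your workspace URL

    resources:
      jobs:
        example_job:                     # hypothetical job that runs a notebook
          name: example-job
          tasks:
            - task_key: run_notebook
              notebook_task:
                notebook_path: ./src/example_notebook.py
              # cluster configuration omitted for brevity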
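To make item 2 concrete, here is a small, hypothetical Delta Live Tables definition in Python that reads an existing Unity Catalog table and publishes a cleaned table; the three-level name main.raw.orders is a placeholder. When the pipeline publishes to Unity Catalog, the destination catalog and schema are chosen in the pipeline settings rather than in this code.

    # Hypothetical DLT source file; table names are placeholders.
    # `spark` is provided by the DLT runtime in pipeline source files.
    import dlt
    from pyspark.sql import functions as F

    @dlt.table(
        name="orders_cleaned",
        comment="Orders with null customer IDs removed (illustrative example)."
    )
    def orders_cleaned():
        # Read an existing Unity Catalog table by its three-level name.
        return (
            spark.read.table("main.raw.orders")
                 .where(F.col("customer_id").isNotNull())
        )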
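For item 3, the privileges listed above might be granted with SQL along these lines; the catalog main, the schema etl, and the group pipeline_owners are placeholders, and the statements would be run by someone who owns (or can grant on) those objects.

    -- Illustrative GRANT statements with placeholder names.
    GRANT USE CATALOG ON CATALOG main TO `pipeline_owners`;
    GRANT USE SCHEMA ON SCHEMA main.etl TO `pipeline_owners`;
    GRANT CREATE MATERIALIZED VIEW ON SCHEMA main.etl TO `pipeline_owners`;  -- needed for materialized views
    GRANT CREATE TABLE ON SCHEMA main.etl TO `pipeline_owners`;              -- needed for streaming tables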

In summary, while Unity Catalog is a powerful tool for managing metadata and tables, directly specifying it in the DLT pipeline YAML file is not currently feasible. Keep an eye on Databricks updates for any enhancements in this area!

For more information, you can refer to the official Databricks documentation on Unity Catalog with Delta Live Tables.

 
