Data Governance

Adding comments to Streaming Tables created with SQL Server Data Ingestion

pdg27
New Contributor

I have been tasked with governing the data within our Databricks instance. A large part of this is adding Comments or Descriptions, and Tags to our Schemas, Tables and Columns in Unity Catalog.

For most objects this has been straightforward, but one place where I'm running into issues is adding Comments or Descriptions to Streaming Tables that were created through the SQL Server Data Ingestion "Wizard", described here: Ingest data from SQL Server - Azure Databricks | Microsoft Learn.

All documentation I have read about adding comments to Streaming Tables mentions adding the Comments to the Lakeflow Declarative Pipelines directly, which would work if we were creating our Lakeflow Declarative Pipelines through Notebooks and ETL Pipelines.

Does anyone know of a way to add these Comments? I see no options through the Data Ingestion UI or the Jobs & Pipelines UI.

Note: we did look into adding Comments and Tags through DDL commands, and we managed to set some Column Comments and Tags this way, but the Comments did not persist, and we aren't sure whether the Tags will persist.
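
For reference, the DDL we tried was along these lines (the catalog, schema, and table names here are illustrative, not our real ones):

    # Illustrative names only; run from a notebook against a Unity Catalog table.
    # Table-level comment -- this is the part that did not persist for us.
    spark.sql(
        "COMMENT ON TABLE main.ingest.orders IS 'Orders replicated from SQL Server'"
    )
    # Column-level comment.
    spark.sql(
        "ALTER TABLE main.ingest.orders "
        "ALTER COLUMN order_id COMMENT 'Primary key in the source system'"
    )
    # Table-level tags -- these applied, but we have not verified persistence.
    spark.sql(
        "ALTER TABLE main.ingest.orders "
        "SET TAGS ('source' = 'sqlserver', 'steward' = 'data_governance')"
    )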

Accepted Solution

mark_ott
Databricks Employee

It is currently not possible to reliably add or persist comments or descriptions directly on Streaming Tables created via the SQL Server Data Ingestion Wizard using the Data Ingestion UI or the Jobs & Pipelines UI. All metadata management for Lakeflow Streaming Tables, including comments and tags, is expected to be handled within the Lakeflow Declarative Pipeline definitions themselves, or programmatically via code in Notebooks and ETL pipelines.

DDL and UI Approaches

  • Attempts to add comments or tags to these Streaming Tables using Databricks SQL DDL commands ("COMMENT ON TABLE" or "ALTER TABLE … SET TAGS") may appear to work at first, but typically do not persist across pipeline executions or refresh cycles. This is because the pipeline re-creates its tables during refreshes, discarding any manually set metadata not present in the pipeline definition.

  • Similarly, attempting to add or edit comments through the Catalog Explorer UI is unreliable for Streaming Tables if the underlying pipeline-managed metadata is not updated, or if required cluster settings (such as spark.databricks.delta.catalog.update.enabled) are not set. For regular Delta or Unity Catalog tables, ensure that this setting is true so comment metadata propagates to the catalog UI; it does not override pipeline-managed metadata on streaming tables (see the snippet after this list).
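
For completeness, the setting named above can be applied at the session level like this (the setting name is taken from this answer; verify it against your runtime version):

    # Session-scoped Spark conf; per the note above this helps regular Delta/UC
    # tables surface comments in Catalog Explorer, but it does not affect
    # pipeline-managed streaming tables.
    spark.conf.set("spark.databricks.delta.catalog.update.enabled", "true")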

Recommendation & Workarounds

  • Lakeflow Pipeline Code: The only permanent way to add Comments or Descriptions to Streaming Tables is to modify the Lakeflow Declarative Pipeline code itself, adding comments at table creation or within the DLT definitions (e.g., using the comment argument in @dlt.table; see the sketch after this list). This requires exporting and editing pipeline code, which is not exposed in the Wizard-driven ingestion approach.

  • Tags: Tags can sometimes be added via DDL, but their persistence depends on pipeline behavior. If the pipeline overwrites the table, manually applied tags may also be lost unless set in the pipeline definition.

  • Governance Strategy: For environments requiring robust, persistent governance metadata, migrate from Wizard-based ingestion to code-defined Lakeflow pipelines. This gives full control over comments, descriptions, and tags, because the metadata is declared in the pipeline source rather than patched on afterward.
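
As a sketch of what the code-defined approach looks like, here is a minimal Lakeflow Declarative Pipeline (DLT) table definition with governance metadata attached. The table name, columns, and raw source are hypothetical; the comment argument of @dlt.table is the piece the recommendation above refers to, and the schema string is one way to attach column-level comments as well:

    import dlt
    from pyspark.sql.functions import col

    @dlt.table(
        name="orders",
        comment="Orders replicated from SQL Server; owned by the ingestion pipeline.",
        schema="""
            order_id BIGINT COMMENT 'Primary key in the source system',
            order_ts TIMESTAMP COMMENT 'Transaction timestamp (UTC)',
            amount DECIMAL(18, 2) COMMENT 'Order total'
        """,
    )
    def orders():
        # Hypothetical raw source table; replace with your actual ingestion source.
        return (
            spark.readStream.table("main.ingest.orders_raw")
            .select(col("order_id"), col("order_ts"), col("amount"))
        )

Because the comments live in the pipeline source, they are reapplied on every pipeline update instead of being discarded by the refresh.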

