Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Can we add a column comments for a materialized view on Azure Databricks?

Smriti2
New Contributor II

I want to understand whether it's possible to add or update column comments on an existing materialized view in Azure Databricks, and if so, what command should be used, especially when updating comments for multiple columns at once.

 

Hereโ€™s my situation:

 

  • The materialized view already exists.
  • It has around 50 columns in total.
  • About 23 columns inherited their comments automatically because they were selected directly from a base table that already had column comments.
  • Another 21 columns come from a different table that did not have column comments, so those comments are missing.
  • The remaining 6 columns are derived/calculated columns defined directly in the materialized view, and naturally they have no comments either.

 

 

I want to populate column comments for all missing columns, regardless of whether they come from another table or are computed in the materialized view itself.

 

The materialized view was created using something like:

CREATE OR REPLACE MATERIALIZED VIEW mv_name
COMMENT 'view comment'
AS
SELECT * FROM table;

3 REPLIES

So what can be done? For most of the columns there are no comments.

youssefmrini
Databricks Employee

SteveOstrowski
Databricks Employee

Hi @Smriti2,

Yes, you can absolutely add column comments to a materialized view on Azure Databricks. There are two approaches you can use.


OPTION 1: ALTER MATERIALIZED VIEW (Recommended for Existing Views)

Since your materialized view already exists, you can use ALTER MATERIALIZED VIEW with the ALTER COLUMN clause to add comments to individual columns. This requires Databricks SQL or Databricks Runtime 16.2 and above.

ALTER MATERIALIZED VIEW my_catalog.my_schema.my_mv
ALTER COLUMN my_column_name COMMENT 'Description of this column';

You would run one ALTER statement per column. For your 27 columns that are missing comments, you can script this out. For example:

ALTER MATERIALIZED VIEW my_catalog.my_schema.my_mv
ALTER COLUMN column_a COMMENT 'Total revenue from sales';

ALTER MATERIALIZED VIEW my_catalog.my_schema.my_mv
ALTER COLUMN column_b COMMENT 'Customer identifier from source system';

ALTER MATERIALIZED VIEW my_catalog.my_schema.my_mv
ALTER COLUMN derived_col COMMENT 'Calculated field: col_x divided by col_y';

Documentation: https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/sql-ref-syntax-ddl-alter-mate...


OPTION 2: COMMENT ON COLUMN

You can also use the COMMENT ON statement, which supports materialized views. This requires Databricks SQL or Databricks Runtime 16.1 and above.

COMMENT ON COLUMN my_catalog.my_schema.my_mv.my_column_name
IS 'Description of this column';

Documentation: https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/sql-ref-syntax-ddl-comment


OPTION 3: Define Comments at Creation Time

If you ever recreate the materialized view, you can specify column comments inline in the CREATE statement by providing a column list with types and comments:

CREATE OR REPLACE MATERIALIZED VIEW my_catalog.my_schema.my_mv
(
column_a STRING COMMENT 'Total revenue from sales',
column_b INT COMMENT 'Customer identifier from source system',
derived_col DOUBLE COMMENT 'Calculated field: col_x divided by col_y'
)
COMMENT 'View-level comment here'
AS SELECT * FROM my_source_table;

Note that when specifying column comments this way, you must also include the column type. If the column type is omitted, the column comment is skipped.

Documentation: https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/sql-ref-syntax-ddl-create-mat...


SCRIPTING TIP FOR BULK UPDATES

Since you have 27 columns to update, you can generate the ALTER statements programmatically. For example, in a Databricks notebook:

comments = {
    "column_a": "Description for column_a",
    "column_b": "Description for column_b",
    "derived_col": "Calculated field description",
}

mv_name = "my_catalog.my_schema.my_mv"

for col, comment in comments.items():
    # Escape single quotes so a comment containing an apostrophe doesn't break the SQL
    escaped = comment.replace("'", "''")
    spark.sql(f"ALTER MATERIALIZED VIEW {mv_name} ALTER COLUMN {col} COMMENT '{escaped}'")

You can verify the comments were applied by running:

DESCRIBE TABLE EXTENDED my_catalog.my_schema.my_mv;

This will show all columns along with their comments and data types.
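If you want to check programmatically which columns still lack comments, a small helper like the one below can filter the output. This is just a sketch: it assumes you have collected the results of a DESCRIBE query into (name, type, comment) tuples, and `find_columns_missing_comments` is a hypothetical name, not a Databricks API.

```python
def find_columns_missing_comments(rows):
    """Given (col_name, data_type, comment) tuples, e.g. collected from
    a DESCRIBE TABLE query, return the column names whose comment is
    NULL or empty."""
    missing = []
    for name, _dtype, comment in rows:
        # DESCRIBE output can include section headers and blank rows; skip them
        if not name or name.startswith("#"):
            continue
        if comment is None or comment.strip() == "":
            missing.append(name)
    return missing

# Example with made-up rows mirroring the columns discussed above
rows = [
    ("column_a", "string", "Total revenue from sales"),
    ("column_b", "int", None),
    ("derived_col", "double", ""),
]
print(find_columns_missing_comments(rows))  # ['column_b', 'derived_col']
```

You could feed the resulting list straight into the ALTER loop from the scripting tip, so you only touch the 27 columns that actually need comments.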

Hope this helps you get all your column comments in place.

* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for obvious issues and to monitor system reliability, and I update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand-new features.