05-07-2026 01:08 AM
I need to add PII tags at both the table and column levels for a streaming table created using Spark Declarative Pipelines.
I tried applying Unity Catalog tags with the following code inside the SDP Python pipeline:
spark.sql(f"""
ALTER TABLE {table_name}
SET TAGS ({tags_sql})
""")
However, this fails with the following error:
UNSUPPORTED_SPARK_SQL_COMMAND
'${command}' is not supported in spark.sql("...") API in SDP Python.
Supported command: ${supportedCommands}.
What is the correct way to define or apply PII tags for tables and columns created by Spark Declarative Pipelines?
05-07-2026 02:18 AM
Hi @bi_123 !
UC tags have to be applied outside the SDP definition, not inside the SDP Python function.
@dp.table(table_properties=...) can set table properties, but those are not the same as UC tags. spark.sql("ALTER TABLE ...") inside SDP Python is not supported because pipeline code is evaluated as a declarative graph, and dataset functions should only define or return DataFrames.
For your streaming table, use ALTER STREAMING TABLE rather than ALTER TABLE:
-- table level tag
ALTER STREAMING TABLE catalog.schema.my_streaming_table
SET TAGS ('pii' = 'true');
-- column level tags
ALTER STREAMING TABLE catalog.schema.my_streaming_table
ALTER COLUMN email SET TAGS ('pii' = 'email');
ALTER STREAMING TABLE catalog.schema.my_streaming_table
ALTER COLUMN ssn SET TAGS ('pii' = 'ssn');
Run this from a Databricks SQL environment, or as a post-deployment CI/CD step after the pipeline creates or refreshes the table.
Senior BI/Data Engineer | Microsoft MVP Data Platform | Microsoft MVP Power BI | Power BI Super User | C# Corner MVP
Tuesday
Hi @amirabedhiafi: Is it not possible to pass the StructField to the schema and then pass that schema to dlt.createStreamingTable (name, schema)?
I tried passing the column descriptions that way and that works. I am wondering why tags do not work 🙂