@NamNguyenCypher
Delta Live Tables’ Python API does not currently honor column-mask metadata embedded in a PySpark StructType. Column masks (and row filters) on DLT tables are applied only when you define the table with a DDL-style schema string that includes a MASK clause, or when you define the table in SQL.
Why your StructField(... metadata={"mask": "mask_all"}) isn’t working
The Python create_streaming_table(..., schema=StructType) call publishes the schema (data types, comments, nullability), but it does not inspect StructField.metadata for mask or maskingPolicy keys. https://docs.databricks.com/aws/en/dlt-ref/dlt-python-ref-streaming-table
Column masks in DLT are applied at the table definition level via SQL’s MASK clause, not via Spark schema metadata. https://docs.azure.cn/en-us/databricks/dlt/sql-ref
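If you’d rather stay in SQL, the same masked table can be declared directly. A minimal sketch, assuming the mask function lives at my_catalog.my_schema.ssn_mask_fn; the source path is a placeholder for your actual stream:

```sql
-- DLT SQL definition with an inline column mask on ssn.
-- The read_files path below is hypothetical; substitute your real source.
CREATE OR REFRESH STREAMING TABLE account (
  account_id STRING,
  email STRING,
  ssn STRING MASK my_catalog.my_schema.ssn_mask_fn COMMENT 'SSN masked for privacy'
)
COMMENT 'Masked account stream'
AS SELECT account_id, email, ssn
   FROM STREAM read_files('/mnt/raw/account');
```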
Use a SQL-DDL string for your schema
Pass a single string to the schema parameter that embeds the MASK expression inline, e.g.:
import dlt

dlt.create_streaming_table(
    name="account",
    schema="""
        account_id STRING,
        email STRING,
        ssn STRING
            MASK my_catalog.my_schema.ssn_mask_fn
            COMMENT 'SSN masked for privacy'
    """,
    comment="Masked account stream",
    path="/mnt/dlt/account",
    partition_cols=["account_id"],
)
Here, ssn is masked by the SQL UDF ssn_mask_fn every time the table is read. Note that the MASK clause takes the function name without call parentheses.
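For completeness, the masking function itself is an ordinary Unity Catalog SQL UDF whose first parameter receives the column value. A minimal sketch, assuming a function name of ssn_mask_fn and a hypothetical 'pii_admins' group:

```sql
-- Hypothetical mask function: members of 'pii_admins' see the real SSN,
-- everyone else sees a redacted constant.
CREATE OR REPLACE FUNCTION my_catalog.my_schema.ssn_mask_fn(ssn STRING)
RETURNS STRING
RETURN CASE
  WHEN is_account_group_member('pii_admins') THEN ssn
  ELSE '***-**-****'
END;
```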
LR