03-25-2026 06:43 PM
Hi,
I am currently designing a PII governance framework to meet CCPA compliance requirements on Databricks. I understand that Databricks provides mechanisms such as VACUUM and Deletion Vectors combined with REORG … APPLY (PURGE) to permanently remove data. With a well‑designed deletion workflow across the Lakeflow / medallion architecture, end‑to‑end PII deletion can be achieved.
However, I would like to understand whether Databricks offers any native feature, service, or managed capability that can reduce the operational overhead of implementing and maintaining this workflow, and help centrally orchestrate and enforce PII deletions across the entire lakehouse, rather than relying primarily on custom pipelines and control tables.
Thanks in advance and really appreciate your response.
03-26-2026 02:13 AM
Hi @abhijit007 ,
A new Data Classification feature (currently in Public Preview) lets you automatically classify and tag sensitive data in your catalog. It goes through a few steps:
system.data_classification.results;
Check for more details:
Best regards,
03-26-2026 01:20 AM
Hi @abhijit007,
No, Databricks still does NOT provide a native, centralized PII deletion orchestration service across the lakehouse. As you rightly mentioned, though, it is achievable through custom pipelines and control tables.
Check this - Prepare your data for GDPR compliance | Databricks on AWS
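To make the "custom pipelines and control tables" idea a bit more concrete, here is a minimal sketch in Python of how one control-table row could be turned into the delete, purge, and vacuum statements mentioned in the question. The `DeletionTarget` class, the `build_purge_statements` helper, and the table and column names are hypothetical and only for illustration; this is not a Databricks API, just one way such a workflow might be structured.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical shape of one control-table row: where a data subject's PII lives.
@dataclass(frozen=True)
class DeletionTarget:
    table: str        # fully qualified name, e.g. catalog.schema.table
    key_column: str   # column that holds the subject identifier

# Plain dotted identifiers only, so control-table values cannot inject SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")

def build_purge_statements(
    target: DeletionTarget, subject_id: str, retain_hours: int = 168
) -> List[Tuple[str, Dict[str, str]]]:
    """Return (sql, params) pairs for one table, in execution order."""
    for name in (target.table, target.key_column):
        if not _IDENT.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return [
        # 1. Logical delete; the subject id is bound as a named parameter.
        (f"DELETE FROM {target.table} WHERE {target.key_column} = :subject_id",
         {"subject_id": subject_id}),
        # 2. Rewrite files that still hold rows soft-deleted by deletion vectors.
        (f"REORG TABLE {target.table} APPLY (PURGE)", {}),
        # 3. Remove unreferenced files older than the retention window.
        (f"VACUUM {target.table} RETAIN {retain_hours} HOURS", {}),
    ]

if __name__ == "__main__":
    target = DeletionTarget("main.silver.customers", "customer_id")
    for sql, params in build_purge_statements(target, "cust-123"):
        print(sql, params)
```

In a notebook, each pair could be passed to `spark.sql(sql, args=params)` on runtimes that support named parameter markers, looping over every target listed in the control table for bronze, silver, and gold. Note that VACUUM only removes files older than the retention threshold, so the physical deletion completes only after that window has passed.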
Thanks.
03-27-2026 01:03 AM
Hi @Sumit_7 ,
Thanks for the details. It's helpful.
03-27-2026 01:05 AM
Hi @aleksandra_ch ,
Thanks. The notebook reference is helpful.