03-25-2026 06:43 PM
Hi,
I am currently designing a PII governance framework to meet CCPA compliance requirements on Databricks. I understand that Databricks provides mechanisms such as VACUUM and Deletion Vectors combined with REORG … APPLY (PURGE) to permanently remove data. With a well-designed deletion workflow across the Lakeflow / medallion architecture, end-to-end PII deletion can be achieved.
However, I would like to understand whether Databricks offers any native feature, service, or managed capability that can reduce the operational overhead of implementing and maintaining this workflow, and help centrally orchestrate and enforce PII deletions across the entire lakehouse, rather than relying primarily on custom pipelines and control tables.
Thanks in advance and really appreciate your response.
03-26-2026 02:13 AM
Hi @abhijit007 ,
A new Data Classification feature (currently in Public Preview) lets you automatically classify and tag sensitive data in your catalog. It goes through a few steps:
system.data_classification.results;
Check for more details:
Best regards,
03-26-2026 01:20 AM
Hi @abhijit007,
No, Databricks still does NOT provide a native, centralized PII deletion orchestration service across the lakehouse. As you rightly mentioned, though, it is achievable through custom pipelines and control tables.
Check this - Prepare your data for GDPR compliance | Databricks on AWS
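To make the "custom pipelines and control tables" idea a bit more concrete, here is a minimal sketch in Python. It assumes a hypothetical control table that lists, for each Delta table, which column holds the data-subject identifier. The names DeletionTarget and build_purge_plan and the example table are mine, not a Databricks API. The sketch only builds the ordered statements (DELETE, then REORG TABLE ... APPLY (PURGE), then VACUUM) and leaves execution to you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeletionTarget:
    """One row of a hypothetical control table: where a subject's PII lives."""
    table: str        # fully qualified Delta table name (assumed)
    key_column: str   # column holding the data-subject identifier (assumed)


def build_purge_plan(targets, subject_id, retain_hours=168):
    """Return (sql, args) pairs, in execution order, for one deletion request.

    The subject id is passed as a named parameter (:subject_id) rather than
    being spliced into the SQL text. Table and column names come from the
    control table and are assumed to be trusted.
    """
    plan = []
    for t in targets:
        # 1. Logical delete (a soft delete when deletion vectors are enabled).
        plan.append((
            f"DELETE FROM {t.table} WHERE {t.key_column} = :subject_id",
            {"subject_id": subject_id},
        ))
        # 2. Rewrite data files so soft-deleted rows are physically dropped.
        plan.append((f"REORG TABLE {t.table} APPLY (PURGE)", {}))
        # 3. Remove the old files once they are past the retention window.
        plan.append((f"VACUUM {t.table} RETAIN {retain_hours} HOURS", {}))
    return plan


# Assumed usage on a cluster where `spark` is the active SparkSession and
# named parameter markers are supported:
#   for sql, args in build_purge_plan(targets, "c-42"):
#       spark.sql(sql, args=args)
```

Keeping the plan as plain data makes it easy to log each statement to an audit table before running it. Note that VACUUM only removes files older than the retention window, so the physical purge completes only after that window has passed.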
Thanks.
03-27-2026 01:03 AM
Hi @Sumit_7 ,
Thanks for the details. It's helpful.
03-27-2026 01:05 AM
Hi @aleksandra_ch ,
Thanks. The notebook reference is helpful.