AbhaySingh
Databricks Employee

Here is my action plan if it helps!

Phase 1: Foundation
  ☐ Migrate to UC managed tables (if not already)
  ☐ Enable Predictive Optimization at catalog level
  ☐ Set delta.deletedFileRetentionDuration per layer
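A rough sketch of how the Phase 1 statements could be generated, in case it is useful. The retention windows, the catalog name, and the table names are placeholders I made up for illustration, not recommendations; the helper only builds SQL strings, which you would then run from a notebook or Workflow job.

```python
# Hypothetical per-layer retention windows; replace with your own policy.
RETENTION_BY_LAYER = {
    "bronze": "interval 7 days",
    "silver": "interval 14 days",
    "gold": "interval 30 days",
}


def foundation_statements(catalog: str, tables_by_layer: dict[str, list[str]]) -> list[str]:
    """Build the SQL for Phase 1: catalog-level Predictive Optimization,
    then a per-table delta.deletedFileRetentionDuration based on the layer."""
    stmts = [f"ALTER CATALOG {catalog} ENABLE PREDICTIVE OPTIMIZATION"]
    for layer, tables in tables_by_layer.items():
        retention = RETENTION_BY_LAYER[layer]  # KeyError on an unknown layer
        for table in tables:
            stmts.append(
                f"ALTER TABLE {catalog}.{table} SET TBLPROPERTIES "
                f"('delta.deletedFileRetentionDuration' = '{retention}')"
            )
    return stmts


if __name__ == "__main__":
    # "main" and "bronze.events" are example names only.
    for stmt in foundation_statements("main", {"bronze": ["bronze.events"]}):
        print(stmt)
```

Keeping the windows in one dictionary means the per-layer policy lives in a single place that can be reviewed and documented later in Phase 4.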

Phase 2: Retention Policies  
  ☐ Enable Auto-TTL on Bronze tables (request Private Preview access)
  ☐ Enable Auto-TTL on Silver tables with appropriate windows
  ☐ Configure Azure lifecycle policies for archival tiers
  ☐ Set delta.timeUntilArchived on tables with lifecycle policies
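For the last item in Phase 2, a minimal sketch of setting delta.timeUntilArchived. The helper name and the example table are hypothetical, and I am assuming the number of days is chosen to match the archival rule in your storage lifecycle policy. I have left Auto-TTL out because it is in Private Preview and its syntax may differ for you.

```python
def archival_statement(table: str, days_until_archived: int) -> str:
    """Build the SQL that tells Delta when files in this table are
    considered archived by the storage lifecycle policy."""
    if days_until_archived <= 0:
        raise ValueError("days_until_archived must be a positive number of days")
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('delta.timeUntilArchived' = '{days_until_archived} days')"
    )


if __name__ == "__main__":
    # Example table name and window only; align the window with your lifecycle rule.
    print(archival_statement("main.bronze.events", 180))
```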

Phase 3: Deletion Workflows
  ☐ Create GDPR/CCPA control table for deletion requests
  ☐ Build scheduled Workflow job: DELETE → REORG PURGE → VACUUM
  ☐ Use Materialized Views in Silver/Gold for automatic propagation
  ☐ Test with VACUUM DRY RUN before production runs
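The DELETE → REORG PURGE → VACUUM sequence could be sketched roughly like this. The control table layout (a key column plus a `status` column with a 'PENDING' value) is my assumption, as are the function and table names; adapt them to however you track deletion requests.

```python
def deletion_statements(table: str, key_column: str, control_table: str,
                        dry_run: bool = True) -> list[str]:
    """Build the ordered SQL for one table in the deletion job.

    With dry_run=True the final step is VACUUM ... DRY RUN, which only
    lists the files that would be removed instead of deleting them.
    """
    vacuum = f"VACUUM {table}"
    if dry_run:
        vacuum += " DRY RUN"
    return [
        # 1. Logically delete rows for every pending request (assumed schema).
        f"DELETE FROM {table} WHERE {key_column} IN "
        f"(SELECT {key_column} FROM {control_table} WHERE status = 'PENDING')",
        # 2. Rewrite files so soft-deleted data is physically dropped from them.
        f"REORG TABLE {table} APPLY (PURGE)",
        # 3. Remove old files that are past the retention window.
        vacuum,
    ]


if __name__ == "__main__":
    # Example names only.
    for stmt in deletion_statements("main.silver.customers", "customer_id",
                                    "main.compliance.deletion_requests"):
        print(stmt)
```

Defaulting to `dry_run=True` keeps the destructive VACUUM an explicit opt-in, which matches the "test before production" step above.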

Phase 4: Auditability
  ☐ Set up dashboard on system.storage.predictive_optimization_operations_history
  ☐ Create SQL Alerts for TTL/VACUUM failures
  ☐ Document retention policies per catalog/schema
  ☐ Build compliance report: tables with/without retention policies
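For the compliance report, one possible starting point is below. It assumes you have already collected each table's properties into a dictionary (for example from SHOW TBLPROPERTIES), and it treats "has delta.deletedFileRetentionDuration set" as the definition of "has a retention policy". That definition is my simplification; you may want to check TTL or archival settings as well.

```python
RETENTION_PROPERTY = "delta.deletedFileRetentionDuration"


def retention_report(table_properties: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Split tables into those with and without an explicit retention property.

    table_properties maps a full table name to its table properties.
    """
    report: dict[str, list[str]] = {"with_policy": [], "without_policy": []}
    for table in sorted(table_properties):
        has_policy = RETENTION_PROPERTY in table_properties[table]
        report["with_policy" if has_policy else "without_policy"].append(table)
    return report


if __name__ == "__main__":
    # Example input only; table names are placeholders.
    sample = {
        "main.bronze.events": {RETENTION_PROPERTY: "interval 7 days"},
        "main.gold.sales": {},
    }
    print(retention_report(sample))
```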

Phase 5: Future (Attribute-Based Retention, if and when it becomes available on the platform)
  ☐ Define governed tags for TTL time columns
  ☐ Apply catalog/schema-level retention policies via ABAC
  ☐ Monitor via Governance Hub

 
