Handling GDPR requests in Databricks

Hubert-Dudek
Databricks MVP

When dealing with GDPR requests in Databricks, there are some essential things to keep in mind:

- Use a low retention period so that Delta version history is not kept for long on tables that contain personal information.

- Use APPLY CHANGES to handle Slowly Changing Dimension type 1. Type 1 overwrites records in place, so, unlike type 2, you won't track the history of previous values, and the current state will be in a separate table.

- When handling customer insertion and GDPR requests, use the change data feed in Databricks. Declare the table as LIVE, not STREAM, so that the data is fully reloaded and no longer includes records for which a GDPR request has been received.
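
To make the retention point more concrete, here is a minimal sketch of the statements one erasure request could translate to on a plain Delta table. It only builds the SQL strings, so it runs without a cluster. The function name `gdpr_erasure_statements`, the table `crm.customers`, and the column `customer_id` are hypothetical names for illustration and are not from the post above; the table properties and the `DELETE` / `VACUUM` commands are standard Delta Lake features.

```python
import re
from typing import List, Sequence

# Accepts simple names like "customers" or "catalog.schema.customers".
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}$")


def gdpr_erasure_statements(
    table: str,
    key_column: str,
    customer_ids: Sequence[int],
    retention_days: int = 7,
) -> List[str]:
    """Build the Delta SQL statements for one batch of erasure requests.

    All names passed in are illustrative assumptions, not a fixed schema.
    """
    if not _IDENTIFIER.match(table) or not _IDENTIFIER.match(key_column):
        raise ValueError("unexpected table or column name")
    if not customer_ids:
        raise ValueError("no customer ids given")

    ids = ", ".join(str(int(i)) for i in customer_ids)
    interval = f"interval {int(retention_days)} days"
    return [
        # Keep removed data files and transaction-log history only briefly.
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.deletedFileRetentionDuration' = '{interval}', "
        f"'delta.logRetentionDuration' = '{interval}')",
        # Logical delete: the rows disappear from the current table version.
        f"DELETE FROM {table} WHERE {key_column} IN ({ids})",
        # Physical delete: drop unreferenced files older than the threshold.
        f"VACUUM {table} RETAIN {int(retention_days) * 24} HOURS",
    ]


if __name__ == "__main__":
    for statement in gdpr_erasure_statements("crm.customers", "customer_id", [42, 7]):
        print(statement)
```

In a notebook, each returned string could be passed to `spark.sql(...)`. Note that `DELETE` alone leaves the old data files reachable through time travel until `VACUUM` removes them, and that a `VACUUM` threshold below the default 7 days (168 hours) is rejected unless the retention duration safety check is disabled.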

My blog: https://databrickster.medium.com/

jose_gonzalez
Databricks Employee

Thank you for sharing this information @Hubert-Dudek!!!!