cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Deleting Records from DLT Bronze and Silver Tables

Gianfranco
New Contributor II

I have a pipeline that generates two DLT streaming tables: a Bronze table and a Silver table. I need to delete specific records from both tables. I've read an article (https://www.databricks.com/blog/handling-right-be-forgotten-gdpr-and-ccpa-using-delta-live-tables-dl...) suggesting that with two streaming tables, the only option is to delete the data in the Bronze table and then perform a full refresh of the Silver table.

However, the Silver table is very large, and I'd like to avoid a full refresh. Are there any alternative solutions available?

Thanks

2 REPLIES 2

Brahmareddy
Valued Contributor III

Hi @Gianfranco,

How are you doing it today?

As per my understanding, Consider using a selective delete approach on both the Bronze and Silver tables to avoid a full refresh. Instead of deleting data and refreshing the entire Silver table, you could delete the specific records in both tables by writing an appropriate delete query directly against each Delta table. This way, you're only removing the required records without needing to rebuild the Silver table from scratch. Another option might be to implement Change Data Capture (CDC), so you can track and update only the affected records, reducing the need for full table operations. Additionally, explore using vacuum or optimize commands to maintain table performance after deletions.

Give a try and see if it works.

Good day.

Regards,

Brahma

filipniziol
Contributor III

Hi @Gianfranco ,

Try APPLY CHANGES. This will work well in your scenario, as it supports DELETE operation as well:
https://docs.databricks.com/en/delta-live-tables/cdc.html

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group