Impact of VACUUM and retention settings on Delta Lake

mjedy78
New Contributor II

I have a table that needs to support time travel for up to 6 months. To preserve the necessary metadata and data files, I've already configured the table with the following properties:

ALTER TABLE table_x SET TBLPROPERTIES ( 'delta.logRetentionDuration' = 'interval 180 days', 'delta.deletedFileRetentionDuration' = 'interval 180 days' );

However, there is a Delta Lake maintenance job that runs:

VACUUM table_x RETAIN 720 HOURS; -- 30 days

Will running VACUUM table_x RETAIN 720 HOURS ignore the 180-day table properties and potentially delete files needed for time travel?

Thanks

1 REPLY

intuz
Contributor II

Yes, the VACUUM table_x RETAIN 720 HOURS; command will override your table-level file retention setting and can break your 6-month time travel capability.
When you explicitly specify a retention period in the VACUUM command, it takes precedence over the table property 'delta.deletedFileRetentionDuration' for that run. The 'delta.logRetentionDuration' property is a separate setting: it controls how long transaction log entries are kept, and VACUUM does not change it.

In your case, the maintenance job is enforcing a 30-day retention policy that conflicts with your 180-day table configuration. The vacuum operation will remove data files that are no longer part of the current table version and are older than 30 days, even though transaction log entries within your 180-day window may still reference them. Time travel to any version that depends on those files will then fail.
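One way to bring the maintenance job in line with the table properties is sketched below. It reuses table_x from the question and assumes the job can be edited; the 4320-hour figure is simply 180 days x 24 hours. Treat it as a starting point, not a drop-in replacement for your job.

```sql
-- Preview which files the current 30-day job would delete, without deleting anything.
VACUUM table_x RETAIN 720 HOURS DRY RUN;

-- Option 1: omit RETAIN so VACUUM falls back to the table's
-- delta.deletedFileRetentionDuration (180 days in this case).
VACUUM table_x;

-- Option 2: state the 180-day window explicitly (180 * 24 = 4320 hours).
VACUUM table_x RETAIN 4320 HOURS;

-- Confirm the retention properties that are actually set on the table.
SHOW TBLPROPERTIES table_x;

-- List the versions recorded in the transaction log.
DESCRIBE HISTORY table_x;
```

Note that files already deleted by earlier 30-day runs cannot be recovered by changing the job, so versions that depended on them stay unreadable even though they still appear in DESCRIBE HISTORY.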
Hope this helps!
