Data Engineering

How are deletedFileRetentionDuration and logRetentionDuration associated with VACUUM?

New Contributor III

I am trying to learn more about the VACUUM operation and came across these two properties:

  1. delta.deletedFileRetentionDuration
  2. delta.logRetentionDuration

Say I have a Delta table from which a few records/files have been deleted. delta.deletedFileRetentionDuration is left at its default (7 days) and delta.logRetentionDuration is left at its default (30 days).

What would happen if I run VACUUM against the table with an interval of 200 days? My questions are below. I am a beginner, so please correct me if my understanding of the concept is wrong.

  1. Will the deleted file be completely cleaned up from storage only after 207 days (the 7-day retention plus the 200-day vacuum interval)?
  2. Since logRetentionDuration is only 30 days, does that mean that from day 31 onward I can no longer see the delete transaction that happened on day 1, and can no longer trace back to the file deleted on day 1?
  3. If my vacuum interval is 200 days, should I ideally also set logRetentionDuration and deletedFileRetentionDuration to 200 days?
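
To make question 1 concrete, here is a rough Python sketch of how I currently picture the eligibility check. It is only my simplified mental model, not Delta's actual implementation: the function `is_vacuum_eligible` and its parameters are hypothetical, and I am assuming the 200 days acts as the retention threshold that VACUUM compares against, so a removed file becomes eligible once it is older than that threshold.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_vacuum_eligible(
    removed_at: Optional[datetime],
    now: datetime,
    retention: timedelta,
) -> bool:
    """Hypothetical model of the check, not Delta's real code.

    A data file is treated as eligible for physical deletion when it is
    no longer referenced by the table (removed_at is set) and it was
    removed longer ago than the retention threshold.
    """
    if removed_at is None:
        # File is still part of the current table state: never eligible.
        return False
    return removed_at < now - retention


# Assumed scenario: a file is logically deleted on "day 0".
removed_at = datetime(2024, 1, 1)
default_retention = timedelta(days=7)     # default deletedFileRetentionDuration
explicit_retention = timedelta(days=200)  # assumed 200-day threshold

for days_elapsed in (8, 199, 201, 207):
    now = removed_at + timedelta(days=days_elapsed)
    print(
        days_elapsed,
        is_vacuum_eligible(removed_at, now, default_retention),
        is_vacuum_eligible(removed_at, now, explicit_retention),
    )
```

If this model is right, the two durations would not add up to 207 days; whichever threshold VACUUM actually uses would decide on its own. Please correct me if that assumption is wrong.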

Thank you.



New Contributor II

No answers to these questions?

I also find the retention process for the underlying Parquet files unclear.

Can someone help here?
