Data Engineering

vacuum does not work as expected

Ramukamath1988
New Contributor II

delta.logRetentionDuration (default 30 days) is not explicitly set on any table in my workspace. According to the documentation, you can time travel within the log retention period, provided delta.deletedFileRetentionDuration is also set to 30 days, which is the case in my example below. We run VACUUM with a 30-day retention every weekend.
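For reference, a setup like the one described here could be sketched as follows. This is only an illustration, not the poster's actual job: the table name `my_schema.my_table` and both helper functions are hypothetical. It relies on the fact that `VACUUM ... RETAIN` takes its threshold in hours, so 30 days becomes 720 hours.

```python
# Sketch only: the table name and helper names are hypothetical,
# they do not come from the original post.
HOURS_PER_DAY = 24
RETAIN_DAYS = 30


def vacuum_statement(table: str, retain_days: int) -> str:
    """Build a VACUUM statement; RETAIN expects the threshold in hours."""
    retain_hours = retain_days * HOURS_PER_DAY
    return f"VACUUM {table} RETAIN {retain_hours} HOURS"


def retention_properties_statement(table: str, retain_days: int) -> str:
    """Build an ALTER TABLE that sets both retention properties explicitly."""
    interval = f"interval {retain_days} days"
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = '{interval}', "
        f"'delta.deletedFileRetentionDuration' = '{interval}')"
    )


weekly_vacuum = vacuum_statement("my_schema.my_table", RETAIN_DAYS)
set_properties = retention_properties_statement("my_schema.my_table", RETAIN_DAYS)
print(weekly_vacuum)
print(set_properties)
# In a Databricks notebook, each string would be passed to spark.sql(...).
```

Building the statements as plain strings keeps the sketch runnable without a cluster; in a notebook each one would be handed to `spark.sql`.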

I have two questions on this:

  • Why can I still go back to the April 22nd version, which is more than 30 days old?
  • Why do the version numbers start from 100? What happened to the previous versions?

https://docs.databricks.com/gcp/en/delta/history

#delta #vacuum

1 ACCEPTED SOLUTION

Raghavan93513
Databricks Employee

There are two configurations that govern your retention period:

  • delta.logRetentionDuration - This configuration specifies how long Delta's transaction log entries are kept in the table history. The default retention period is 30 days, after which older log entries may be deleted.
  • delta.deletedFileRetentionDuration - This setting determines the retention period for stale data files that are no longer referenced in the transaction log. Stale files remain available for a default retention period of 7 days before they are eligible for deletion via the VACUUM command.

Based on the above, here are the answers to your questions:

Q) Why can I still go back to the April 22nd version, which is more than 30 days old?

By default, you can't access data beyond 7 days, because delta.deletedFileRetentionDuration defaults to 7 days. If you run VACUUM after 7 days, those stale data files will have been deleted.

Q) Why do the version numbers start from 100? What happened to the previous versions?

By default, you can only see versions up to 30 days old, because delta.logRetentionDuration defaults to 30 days.
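To make the interaction between the two settings concrete, here is a toy model in plain Python. It is not Delta Lake code, and the file names, dates, and helper functions are all made up for illustration. It assumes the documented semantics: delta.logRetentionDuration governs how long log entries (table history) are kept, while delta.deletedFileRetentionDuration governs when data files that are no longer referenced become eligible for VACUUM. The real VACUUM logic is more involved than this simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class DataFile:
    name: str
    # When the file stopped being referenced by the table;
    # None means the current version still references it.
    removed_at: Optional[datetime]


def vacuum_candidates(files: List[DataFile], now: datetime,
                      deleted_file_retention: timedelta) -> List[str]:
    """Files this toy VACUUM may delete: unreferenced AND past the threshold."""
    cutoff = now - deleted_file_retention
    return [f.name for f in files
            if f.removed_at is not None and f.removed_at < cutoff]


def version_readable(version_files: List[str], surviving: List[str]) -> bool:
    """An old version is readable only if every file it needs still exists."""
    return set(version_files) <= set(surviving)


# Hypothetical state of a table on 30 June.
now = datetime(2025, 6, 30)
files = [
    DataFile("a.parquet", removed_at=None),                  # still referenced
    DataFile("b.parquet", removed_at=datetime(2025, 5, 1)),  # stale for 60 days
    DataFile("c.parquet", removed_at=datetime(2025, 6, 20)), # stale for 10 days
]

candidates = vacuum_candidates(files, now, timedelta(days=30))
surviving = [f.name for f in files if f.name not in candidates]

# A two-month-old version that only needs a.parquet survives the VACUUM;
# one that also needs b.parquet does not.
old_version_ok = version_readable(["a.parquet"], surviving)
old_version_broken = version_readable(["a.parquet", "b.parquet"], surviving)
print(candidates, old_version_ok, old_version_broken)
```

In this model, an old version stays readable after VACUUM as long as the files it needs are still referenced by the current version or are still within the retention window.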


3 REPLIES 3

Ramukamath1988
New Contributor II

This is precisely my observation after vacuuming. I do understand these two parameters, but it's not working as expected. Even after vacuuming (with a 30-day retention), we can go back 2 months, and logs are retained for more than 3 months.

Could you print out and provide the values of the 2 parameters?
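One way to print the two values is `SHOW TBLPROPERTIES` with the property key in parentheses. The sketch below only builds the statements; the table name `my_schema.my_table` and the helper function are hypothetical, and running them is assumed to happen in a notebook where `spark` is available.

```python
from typing import List

# The two retention properties discussed in this thread.
RETENTION_PROPERTIES = (
    "delta.logRetentionDuration",
    "delta.deletedFileRetentionDuration",
)


def show_property_statements(table: str) -> List[str]:
    """Build one SHOW TBLPROPERTIES statement per retention property."""
    return [
        f"SHOW TBLPROPERTIES {table} ('{prop}')"
        for prop in RETENTION_PROPERTIES
    ]


# Hypothetical table name, for illustration only.
statements = show_property_statements("my_schema.my_table")
for stmt in statements:
    print(stmt)
    # In a Databricks notebook: spark.sql(stmt).show(truncate=False)
```

If a property was never set on the table, the table default applies, so it can also help to compare the result with `DESCRIBE HISTORY` on the same table.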


@Ramukamath1988 wrote:

This is precisely my observation after vacuuming. I do understand these two parameters, but it's not working as expected. Even after vacuuming (with a 30-day retention), we can go back 2 months, and logs are retained for more than 3 months.


Are the data files that are 2 months old still referenced by the current version of the table?
