3 weeks ago
delta.logRetentionDuration (default 30 days) is not explicitly set on any table in my workspace. According to the documentation, you can time travel within the log retention period, provided delta.deletedFileRetentionDuration is also set to 30 days, which is the case in my example below. We run VACUUM with a 30-day retention every weekend.
I have two questions on this matter.
https://docs.databricks.com/gcp/en/delta/history
#delta #vacuum
3 weeks ago
Two table properties govern your retention period: delta.logRetentionDuration and delta.deletedFileRetentionDuration.
Now, based on the above provided context, I will answer your questions:
Q) Why can I still go back to the April 22nd version, which is more than 30 days old?
By default you can't access data beyond 7 days, because delta.deletedFileRetentionDuration defaults to 7 days. If you run VACUUM after that period, data files that are no longer referenced by the current table version will have been deleted.
Q) Why do version numbers start at 100? What happened to the previous versions?
You can only see versions up to 30 days old because the default value of delta.logRetentionDuration is 30 days. Older log entries are cleaned up automatically when a checkpoint is written.
3 weeks ago
This is precisely my observation after vacuuming. I do understand these 2 parameters, but it's not working as expected. Even after vacuuming (with a 30-day retention), we can go back 2 months, and logs are retained for more than 3 months.
3 weeks ago
Could you print out and provide the values of the 2 parameters?
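One way to collect those values, sketched in PySpark. It assumes an active `spark` session and a hypothetical table name `main.sales.orders`; `retention_settings` is an illustrative helper, not a Databricks API. Properties that were never set typically do not show up in `SHOW TBLPROPERTIES`, so the sketch falls back to the documented defaults and marks them as such.

```python
# Documented Delta Lake defaults, used when a property was never set on the table.
DEFAULTS = {
    "delta.logRetentionDuration": "interval 30 days",
    "delta.deletedFileRetentionDuration": "interval 1 week",
}


def retention_settings(spark, table_name):
    """Return the effective values of the two retention properties for one table."""
    # SHOW TBLPROPERTIES yields one row per property, with `key` and `value` columns.
    rows = spark.sql(f"SHOW TBLPROPERTIES {table_name}").collect()
    explicit = {row["key"]: row["value"] for row in rows}
    return {
        name: explicit.get(name, f"{default} (default, not set)")
        for name, default in DEFAULTS.items()
    }


# Usage in a notebook, where `spark` already exists (table name is hypothetical):
# for name, value in retention_settings(spark, "main.sales.orders").items():
#     print(name, "=", value)
```

If both properties come back as defaults, the 30-day VACUUM retention is being supplied only through the VACUUM command itself, which is worth mentioning when you share the output.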
@Ramukamath1988 wrote:
This is precisely my observation after vacuuming. I do understand these 2 parameters, but it's not working as expected. Even after vacuuming (with a 30-day retention), we can go back 2 months, and logs are retained for more than 3 months.
Are the data that are 2 months old still referenced in the current data?
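Why this matters can be shown with a toy model in plain Python. It is a simplified sketch of the documented rule, not Delta Lake's implementation: VACUUM only deletes data files that are no longer referenced by the latest table version and that are older than the retention threshold. `DataFile` and `vacuum_candidates` are hypothetical names, the dates are made up, and file age is measured here from the moment the file was logically removed, which is an assumption of the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class DataFile:
    path: str
    added_at: datetime
    # None means the latest table version still references this file.
    removed_at: Optional[datetime] = None


def vacuum_candidates(files: List[DataFile], now: datetime, retention: timedelta) -> List[str]:
    """Return paths that this simplified VACUUM model would delete."""
    cutoff = now - retention
    return [
        f.path
        for f in files
        # Still-referenced files are never deleted, however old they are.
        if f.removed_at is not None and f.removed_at < cutoff
    ]


now = datetime(2025, 6, 22)  # hypothetical "today"
files = [
    DataFile("part-000.parquet", added_at=datetime(2025, 4, 1)),  # still referenced
    DataFile("part-001.parquet", datetime(2025, 4, 1), removed_at=datetime(2025, 5, 1)),
    DataFile("part-002.parquet", datetime(2025, 5, 1), removed_at=datetime(2025, 6, 10)),
]

print(vacuum_candidates(files, now, timedelta(days=30)))  # ['part-001.parquet']
```

In this model `part-000.parquet` survives every VACUUM because the current version still uses it, so an older version that needs only such files can stay readable well past the retention period, as long as its log entries still exist.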