How are deletedFileRetentionDuration and logRetentionDuration associated with VACUUM?
07-10-2023 03:58 AM
I am trying to learn more about Vacuum operation and came across the two properties:
- delta.deletedFileRetentionDuration
- delta.logRetentionDuration
So, let's say I have a Delta table where a few records/files have been deleted. delta.deletedFileRetentionDuration is set to the default (7 days) and delta.logRetentionDuration is set to the default (30 days).
What would happen if I run a vacuum against the table with an interval of 200 days? Following are some of the questions I have. I am a beginner, so kindly correct me if my understanding of the concept is wrong.
- Will the deleted file be completely cleaned up from storage only after 207 days (retention being 7 and vacuum interval 200 days)?
- As the logRetentionDuration is set to only 30 days, from the 31st day I can no longer see which delete transaction happened on the 1st day, and I would not be able to travel back to the file deleted on day 1?
- If I have a vacuum interval of 200 days, then ideally, I have to set the logRetentionDuration and deletedFileRetentionDuration also to 200 days?
Thank you.
11-21-2023 03:40 AM
No answers to these questions?
I also find the retention process for the underlying Parquet files unclear.
Can someone help here?
10-22-2024 09:20 PM
- Will the deleted file be completely cleaned up from storage only after 207 days (retention being 7 and vacuum interval 200 days)?
No, the two values are not added together. Deleting records only marks the affected data files as removed in the transaction log; the files stay in storage until VACUUM runs. VACUUM then deletes files that are no longer referenced by the current table version and are older than the retention threshold. That threshold is delta.deletedFileRetentionDuration (7 days by default) unless the command specifies its own, so VACUUM with RETAIN 4800 HOURS (200 days) deletes only unreferenced files older than 200 days and leaves the newer ones in place.
- As the logRetentionDuration is set to only 30 days, from the 31st day I can no longer see which delete transaction happened on the 1st day, and I would not be able to travel back to the file deleted on day 1?
Yes, this is correct, with one caveat: old log entries are cleaned up when a checkpoint is written, so they may not disappear exactly on day 31. If needed, logRetentionDuration can be set to a longer period. It controls only how long the transaction log (table history) is kept, not the deleted data files.
- If I have a vacuum interval of 200 days, then ideally, I have to set the logRetentionDuration and deletedFileRetentionDuration also to 200 days?
Only if you need to query or restore versions from up to 200 days back. Time travel needs both the log entries and the data files for a version, so in that case set both deletedFileRetentionDuration and logRetentionDuration to 200 days. Otherwise the defaults can stay, and a VACUUM with a 200-day retention simply keeps more unreferenced files, and uses more storage, than the 7-day default would.
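To make the retention arithmetic concrete, here is a minimal Python sketch. It is only a model, not Delta's implementation: it assumes the simplified rule that VACUUM removes a data file once it has been unreferenced for longer than the retention threshold. The helper names (vacuum_removes, retention_statements) and the table name my_table are hypothetical. The SQL strings it builds follow the documented ALTER TABLE ... SET TBLPROPERTIES and VACUUM ... RETAIN n HOURS forms and would typically be passed to spark.sql on a cluster.

```python
from datetime import datetime, timedelta

# Default thresholds quoted in this thread.
DEFAULT_DELETED_FILE_RETENTION = timedelta(days=7)
DEFAULT_LOG_RETENTION = timedelta(days=30)


def vacuum_removes(removed_at, vacuum_at, retain=DEFAULT_DELETED_FILE_RETENTION):
    """Simplified model (an assumption): VACUUM physically deletes a data file
    only if it has been unreferenced for longer than the retention threshold."""
    return vacuum_at - removed_at > retain


def retention_statements(table, days):
    """Build SQL that sets both retention properties to `days` and vacuums
    with the same threshold expressed in hours."""
    hours = days * 24
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.deletedFileRetentionDuration' = 'interval {days} days', "
        f"'delta.logRetentionDuration' = 'interval {days} days')",
        f"VACUUM {table} RETAIN {hours} HOURS",
    ]


day1 = datetime(2023, 1, 1)  # hypothetical date of the delete

# Default 7-day threshold: a VACUUM run 9 days later removes the file.
print(vacuum_removes(day1, day1 + timedelta(days=9)))
# 200-day threshold: the same file survives a VACUUM run 99 days later ...
print(vacuum_removes(day1, day1 + timedelta(days=99), timedelta(days=200)))
# ... and is removed by a run more than 200 days after the delete.
print(vacuum_removes(day1, day1 + timedelta(days=201), timedelta(days=200)))

for stmt in retention_statements("my_table", 200):
    print(stmt)
```

Note that the threshold is compared against how long the file has been unreferenced; the 7-day and 200-day values are never summed.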