How to remove checkpoints from a Delta Lake table?
03-30-2023 11:44 AM
I see that a few checkpoints exist on my Delta table, and I want to remove the oldest one.
Its existence seems to be blocking the removal of the oldest _delta_log entries.
03-31-2023 02:11 AM
Hi @Pawel Woj ,
If you want to keep your checkpoints for X days, you can set delta.checkpointRetentionDuration to X days like this:
# Replace `path` with the table location and X with the number of days.
spark.sql("""
    ALTER TABLE delta.`path`
    SET TBLPROPERTIES (
        delta.checkpointRetentionDuration = 'X days'
    )
""")
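If the path and the number of days come from variables, the statement above could be built with a small helper. This is only a sketch: the function name `build_retention_sql` and the path `/mnt/data/events` are hypothetical and used for illustration. The helper returns the SQL string without running it, so it can be inspected before being passed to spark.sql.

```python
def build_retention_sql(table_path: str, days: int) -> str:
    """Build an ALTER TABLE statement that sets delta.checkpointRetentionDuration.

    Hypothetical helper for illustration; it only formats a string.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    # Backticks delimit the path in delta.`...`, so a backtick inside
    # the path would break the statement.
    if "`" in table_path:
        raise ValueError("table_path must not contain backticks")
    return (
        f"ALTER TABLE delta.`{table_path}` "
        f"SET TBLPROPERTIES (delta.checkpointRetentionDuration = '{days} days')"
    )


# Hypothetical table path, for illustration only.
statement = build_retention_sql("/mnt/data/events", 7)
print(statement)

# In a notebook or job where a SparkSession named `spark` exists:
# spark.sql(statement)
```

Printing the statement first makes it easy to check the path and the interval before changing the table properties.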
04-03-2023 04:31 AM
Hi,
This looks good, thanks. I will test it.
04-03-2023 05:33 AM
Test and let me know if it's working for you.
03-31-2023 07:09 PM
Hi @Pawel Woj ,
Hope everything is going great.
Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.
Cheers!
04-03-2023 04:34 AM
Hi,
The proposed solution looks good and I will test it,
but can you tell me why I can't find this parameter in the official Databricks documentation?
I would like more information about it before applying this configuration in the prod environment.
BR