@TinasheChinyati To retain 14 days of history for a table, you have to tune the parameters below.
To query a previous table version, you must retain both the log and the data files for that version.
Data files are deleted when VACUUM runs against a table. Delta Lake manages log file removal automatically after checkpointing table versions.
Because most Delta tables have VACUUM run against them regularly, point-in-time queries should respect the retention threshold for VACUUM, which is 7 days by default.
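As a rough illustration of the rule above, here is a minimal Python sketch that checks whether a point-in-time query is likely still safe. The helper name `is_time_travel_safe` is hypothetical and not part of any Delta Lake API. It only applies a conservative rule of thumb: the query timestamp should be no older than the smaller of the two retention windows. What is actually queryable also depends on when VACUUM last ran and on checkpointing.

```python
from datetime import datetime, timedelta, timezone

# Defaults quoted in this answer: VACUUM threshold 7 days, log retention 30 days.
DEFAULT_DELETED_FILE_RETENTION = timedelta(days=7)
DEFAULT_LOG_RETENTION = timedelta(days=30)


def is_time_travel_safe(
    query_ts: datetime,
    now: datetime,
    deleted_file_retention: timedelta = DEFAULT_DELETED_FILE_RETENTION,
    log_retention: timedelta = DEFAULT_LOG_RETENTION,
) -> bool:
    """Hypothetical helper: conservative check that a timestamp falls
    inside both the data-file and the log retention windows."""
    age = now - query_ts
    if age < timedelta(0):
        # A timestamp in the future cannot refer to an existing version.
        return False
    # Both the data files and the log entries must still be retained.
    return age <= min(deleted_file_retention, log_retention)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    ten_days_ago = now - timedelta(days=10)
    # With the defaults, 10 days back is outside the 7-day VACUUM threshold.
    print(is_time_travel_safe(ten_days_ago, now))
    # With a 14-day data-file retention, the same timestamp is inside the window.
    print(is_time_travel_safe(ten_days_ago, now, deleted_file_retention=timedelta(days=14)))
```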
In order to increase the data retention threshold for Delta tables, you must configure the following table properties:
- delta.logRetentionDuration = "interval <interval>": controls how long the history for a table is kept. The default is interval 30 days.
- delta.deletedFileRetentionDuration = "interval <interval>": determines the threshold VACUUM uses to remove data files no longer referenced in the current table version. The default is interval 7 days.
You must set both of these properties to ensure table history is retained for a longer duration on tables with frequent VACUUM operations. For example, to access 30 days of historical data, set delta.deletedFileRetentionDuration = "interval 30 days" (which matches the default setting for delta.logRetentionDuration).
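In your case, to retain 14 days of records, set delta.deletedFileRetentionDuration = "interval 14 days". The default delta.logRetentionDuration of 30 days already covers that window, so the log setting can stay as it is.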
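If it helps, here is a minimal Python sketch that builds the `ALTER TABLE ... SET TBLPROPERTIES` statement for the 14-day setting. The helper `build_retention_statement` and the table name `my_schema.my_table` are placeholders I made up for illustration. The last commented line assumes a `spark` session is available, as in a Databricks notebook.

```python
from typing import Optional


def build_retention_statement(
    table_name: str,
    deleted_file_days: int,
    log_days: Optional[int] = None,
) -> str:
    """Hypothetical helper: build an ALTER TABLE statement that sets the
    Delta retention table properties discussed above."""
    if deleted_file_days < 1 or (log_days is not None and log_days < 1):
        raise ValueError("retention must be at least 1 day")
    props = [
        f"'delta.deletedFileRetentionDuration' = 'interval {deleted_file_days} days'"
    ]
    # Only touch the log retention when asked to; its default is 30 days.
    if log_days is not None:
        props.append(f"'delta.logRetentionDuration' = 'interval {log_days} days'")
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES ({', '.join(props)})"


if __name__ == "__main__":
    # 'my_schema.my_table' is a placeholder; substitute your own table name.
    statement = build_retention_statement("my_schema.my_table", 14)
    print(statement)
    # In a Databricks notebook (assumption: a `spark` session exists):
    # spark.sql(statement)
```

Pass `log_days` as well only if you need more history than the 30-day log default, for example `build_retention_statement("my_schema.my_table", 60, 60)`.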
https://docs.databricks.com/en/delta/history.html