Hello experts. We are trying to work out how to clean up the large number of files accumulating in the _delta_log folder (JSON, CRC, and checkpoint files). We went through the related posts on the forum and ran the following:
SET spark.databricks.delta.retentionDurationCheck.enabled = false;
ALTER TABLE table_name
SET TBLPROPERTIES ('delta.logRetentionDuration'='interval 1 minutes', 'delta.deletedFileRetentionDuration'='interval 1 minutes');
VACUUM table_name RETAIN 0 HOURS;
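To keep an eye on whether anything actually gets deleted, we take a quick inventory of the _delta_log folder before and after each checkpoint. Below is a minimal sketch of that check, assuming the table path is mounted as a local filesystem path (on DBFS you would list the directory with dbutils.fs.ls instead; `delta_log_inventory` is our own helper name, not a Databricks API):

```python
from collections import Counter
from pathlib import Path

def delta_log_inventory(table_path):
    """Count the files in a Delta table's _delta_log folder by type.

    Returns a dict like {"json": 120, "crc": 120, "checkpoint": 12},
    covering commit JSONs, CRC files, and checkpoint parquet files.
    """
    log_dir = Path(table_path) / "_delta_log"
    counts = Counter()
    for f in log_dir.iterdir():
        name = f.name
        if name.endswith(".json"):
            counts["json"] += 1            # commit files, e.g. 00000000000000000042.json
        elif name.endswith(".crc"):
            counts["crc"] += 1             # per-commit CRC files
        elif ".checkpoint" in name and name.endswith(".parquet"):
            counts["checkpoint"] += 1      # checkpoint parquet files
    return dict(counts)
```

Comparing the counts across checkpoints is how we confirmed that the totals only grow and nothing is being removed.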
Our understanding is that each time a checkpoint is written, Databricks automatically cleans up log entries older than the configured log retention interval. However, even after new checkpoints and commits, all the old log files are still there.
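To make sure we are reading the documented rule correctly, here is a simplified model of it in Python: a commit JSON should become removable only once it is older than the retention cutoff and a newer checkpoint exists to reconstruct the table state. This is our own sketch of the rule as we understand it, not the actual Delta implementation, and the function name is hypothetical:

```python
def versions_eligible_for_cleanup(log_files, last_checkpoint_version, cutoff_ts):
    """Return commit versions that should be removable under log retention.

    log_files: iterable of (filename, modification_timestamp) pairs from _delta_log.
    A commit JSON qualifies if its version precedes the latest checkpoint
    (so the log can still be rebuilt from that checkpoint) AND its timestamp
    is older than the retention cutoff.
    """
    eligible = []
    for name, mtime in log_files:
        if not name.endswith(".json"):
            continue                      # skip CRC and checkpoint files
        version = int(name.split(".")[0]) # e.g. "00000000000000000007.json" -> 7
        if version < last_checkpoint_version and mtime < cutoff_ts:
            eligible.append(version)
    return sorted(eligible)
```

By this model, with a 1-minute logRetentionDuration, essentially every commit before the latest checkpoint should qualify, yet nothing disappears on our tables.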
Could you please help? For context, these are tables for which we do not need time travel at all.