Friday - last edited Friday
Hi,
On a managed Delta table I get:
SELECT * FROM abc VERSION AS OF 25;
Error:
DELTA_UNSUPPORTED_TIME_TRAVEL_BEYOND_DELETED_FILE_RETENTION_DURATION Cannot time travel beyond delta.deletedFileRetentionDuration (168 HOURS).
Audit logs show VACUUM START/END executed by a service principal (GUID userName); I never ran VACUUM manually. Table properties don’t explicitly set delta.deletedFileRetentionDuration, and Predictive Optimization is enabled (inherited).
Questions:
Friday - last edited Friday
Hi @vidya_kothavale ,
1. The feature is called predictive optimization for managed tables. Predictive optimization runs the following operations on Unity Catalog managed tables:
- OPTIMIZE
- VACUUM
- ANALYZE
You can read more here:
Predictive optimization for Unity Catalog managed tables - Azure Databricks | Microsoft Learn
2. You can disable predictive optimization for a catalog or schema in the following way:
Predictive optimization for Unity Catalog managed tables | Databricks on AWS
ALTER CATALOG [catalog_name] { ENABLE | DISABLE | INHERIT } PREDICTIVE OPTIMIZATION;
ALTER { SCHEMA | DATABASE } schema_name { ENABLE | DISABLE | INHERIT } PREDICTIVE OPTIMIZATION;
And if you want to increase the retention period, you can use the following query:
ALTER TABLE table_name SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = '30 days');
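To confirm the change took effect, you can read the property back (the table name is a placeholder):

```sql
-- Sketch: verify the retention setting on a hypothetical table.
SHOW TBLPROPERTIES table_name ('delta.deletedFileRetentionDuration');
```

Note that time travel also depends on delta.logRetentionDuration (default 30 days), which controls how long the transaction log entries themselves are kept.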
3. You can't do much, since VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer referenced in the latest state of the transaction log and are older than the retention threshold. Of course, you can try to restore the deleted files using cloud-provider-native options.
For instance, if you have soft deletes enabled on Azure Storage, you can try to use that.
If the answer was helpful, please consider marking it as accepted solution
Friday
@szymon_dybczak
I want to disable this property at the workspace level. How can I do that?
Friday - last edited Friday
Hi @vidya_kothavale ,
You can disable predictive optimization for an account, a catalog, or a schema. All Unity Catalog managed tables inherit the account value by default. You can override the account default at the catalog or schema level.
To disable it for your account, follow the steps described here:
Predictive optimization for Unity Catalog managed tables - Azure Databricks | Microsoft Learn
"An account admin can enable predictive optimization for all metastores in an account. Catalogs and schemas inherit this setting by default, but you can override it at either level."
If the answer was helpful, please consider marking it as accepted solution
Friday
Hi
The error DELTA_UNSUPPORTED_TIME_TRAVEL_BEYOND_DELETED_FILE_RETENTION_DURATION confirms that the underlying data files required for Version 25 have been deleted from storage. Since the metadata knows those files should exist but finds them gone, it blocks the query.
The service principal in the logs is the Databricks service executing Predictive Optimization automatically. Predictive Optimization is the standard maintenance feature for Unity Catalog managed tables: it runs OPTIMIZE and VACUUM operations in the background on serverless compute. It targets tables where it detects high file fragmentation or a build-up of expired snapshots, to maintain performance and reduce storage costs.
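As a quick check, the table history should show those automatic VACUUM entries, including which principal ran them (table name taken from the original question):

```sql
-- Sketch: inspect the operations recorded in the Delta log,
-- including VACUUM START/END events and the user who ran them.
DESCRIBE HISTORY abc;
```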
You can increase the retention period:
ALTER TABLE eud_poland.staging.pibb_extract_preprocessed
SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = '30 days');
Opt out: you can disable the service for a specific catalog or schema. More details here.
If you disable it, you must manage table optimization manually to avoid performance degradation.
No. Once VACUUM is complete and the files are deleted from storage, the old state of the data is gone.
Time Travel is not a long-term backup solution. For long-term backups, you can use a Delta deep clone.
Deep Clone creates a separate physical copy of the data and metadata:
CREATE TABLE eud_poland.staging.pibb_extract_backup DEEP CLONE eud_poland.staging.pibb_extract_preprocessed;
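As a sketch, the backup can also be kept up to date by re-running the clone; a deep clone is incremental on re-run, so only files that changed since the last clone are copied:

```sql
-- Sketch: periodically refresh the backup table.
-- Re-running a deep clone syncs it incrementally with the source.
CREATE OR REPLACE TABLE eud_poland.staging.pibb_extract_backup
DEEP CLONE eud_poland.staging.pibb_extract_preprocessed;
```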
Predictive Optimization does not hit every table with the same frequency. This specific table is targeted because:
- Frequent operations: you are doing frequent operations on this table, creating many files that trigger the optimization threshold.
- Table size/growth: the service prioritizes tables where the storage savings or performance gains are most significant.
Friday
If a Delta table has 10 historical versions and none of them have been modified or referenced in the last 7 days (the retention period), when VACUUM runs, does it delete all versions and their files, or does it keep the latest version and only delete older, unused files?
Friday
VACUUM will never delete files belonging to the latest version, even if Version 10 was not accessed or modified, because it represents the current state of the table. VACUUM targets files that are no longer referenced by the current version. It identifies files that were removed (due to DELETE/UPDATE etc. in Versions 0-9); if those specific files are not part of Version 10 and their deletion timestamp in the log is older than the 7-day retention threshold, they are permanently deleted.
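If you want to see exactly which files a VACUUM would remove before anything is deleted, you can do a dry run first (the table name is a placeholder):

```sql
-- Sketch: list the files that would be deleted, without deleting anything.
VACUUM table_name DRY RUN;
-- Then, to actually delete files older than the retention threshold:
VACUUM table_name;
```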