Tuesday
Hi @MVMZ,
What you’re seeing is expected for Unity Catalog managed tables.
The key detail is that for Unity Catalog managed tables, Databricks blocks time travel queries when the requested version is older than delta.deletedFileRetentionDuration, which is 7 days by default. This is called out in the public documentation for table history and time travel.
That explains why SELECT * FROM table VERSION AS OF 1 works while version 1 is still the current version, but starts failing as soon as version 2 is created. Before version 2 exists, you are effectively reading the current table state. Once version 2 is committed, version 1 becomes a historical version, and because it is already older than 168 hours, the query is treated as a time travel request beyond the allowed retention window and is blocked immediately.
So in this case, the behaviour does not depend on whether you ran VACUUM manually. For Unity Catalog managed tables, the query can be blocked based on the table’s delta.deletedFileRetentionDuration setting even if the older files have not yet been cleaned up. The docs also explain that querying an older version requires both the transaction log and the underlying data files for that version to still be retained, and that time travel should generally be planned around the VACUUM retention threshold.
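To see where a given version sits relative to the retention window, something like the following can help. This is a minimal sketch: main.demo.events is a hypothetical table name, so substitute your own three-level name.

```sql
-- List the table's versions and their commit timestamps,
-- so you can see how old version 1 actually is.
DESCRIBE HISTORY main.demo.events;

-- Show the retention setting, if it has been set explicitly on the table.
-- If it was never set, the 7-day default applies and this may report
-- that the property does not exist on the table.
SHOW TBLPROPERTIES main.demo.events ('delta.deletedFileRetentionDuration');

-- The time travel query itself. Once version 1 is no longer the current
-- version and is older than the retention duration, expect this to be blocked.
SELECT * FROM main.demo.events VERSION AS OF 1;
```

Comparing the timestamp column from DESCRIBE HISTORY against the retention duration should tell you which versions are still inside the allowed window.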
Predictive Optimization can also be relevant here, because for Unity Catalog managed tables it can automatically run VACUUM, OPTIMIZE, and ANALYZE in the background. So even if nobody explicitly issued a VACUUM command, old files can still be cleaned up automatically when Predictive Optimization is enabled.
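If you want to check whether Predictive Optimization applies to the table, a quick look at the extended table description is one option. This is a sketch under assumptions: main.demo.events is a hypothetical table name, and the exact label of the Predictive Optimization row in the output may differ between runtime versions.

```sql
-- Inspect the extended metadata for the table. On Unity Catalog managed
-- tables the output should include a row indicating whether Predictive
-- Optimization is enabled, and whether that is inherited from the
-- schema or catalog.
DESCRIBE TABLE EXTENDED main.demo.events;
```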
If the requirement is to keep older versions queryable for longer than 7 days, the fix is to increase delta.deletedFileRetentionDuration before relying on that history. The data retention section of the docs notes that if you want, for example, 30 days of historical access, you should set delta.deletedFileRetentionDuration to 30 days and make sure log retention (delta.logRetentionDuration) is at least as long.
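As a rough example of that change, assuming a hypothetical table main.demo.events and a 30-day target:

```sql
-- Keep removed data files and the transaction log for 30 days, so that
-- versions up to 30 days old stay eligible for time travel.
-- The 30-day value is only an example; pick what your use case needs.
ALTER TABLE main.demo.events SET TBLPROPERTIES (
  'delta.deletedFileRetentionDuration' = 'interval 30 days',
  'delta.logRetentionDuration' = 'interval 30 days'
);
```

Note that this only protects history going forward. Files that have already been removed by VACUUM cannot be recovered by raising the setting afterwards, and a longer retention window means more storage is kept.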
If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***