VACUUM with Azure Storage Inventory Report is not working
04-18-2025 01:45 AM
Could someone please advise on VACUUM with an Azure Storage Inventory Report? I have not been able to make it work.
I am on DBR 15.4 LTS and run the VACUUM command with the USING INVENTORY clause, as follows:
VACUUM schema.table USING INVENTORY (
select 'https://xxx.blob.core.windows.net/' || ir.Name as path,
ir.`Content-Length` as length,
case when ir.hdi_isfolder is null then false else ir.hdi_isfolder end as isDir,
ir.`Last-Modified` as modificationTime
from inventory_raw ir
where ...
)
It does not fail, but it does not vacuum anything.
The DESCRIBE HISTORY output is as follows:
VACUUM END {"numDeletedFiles":"0","numVacuumedDirectories":"1"}
VACUUM START {"numFilesToDelete":"0","sizeOfDataToDelete":"0"}
VACUUM END {"numDeletedFiles":"0","numVacuumedDirectories":"1"}
VACUUM START {"numFilesToDelete":"0","sizeOfDataToDelete":"0"}
At the same time, VACUUM without the USING INVENTORY clause but with the DRY RUN option shows 1k files to be vacuumed.
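A minimal sketch of two checks that may help narrow this down. It assumes, based on the Delta Lake documentation rather than anything confirmed for Databricks, that the inventory is expected to expose path as a string, length as a long, isDir as a boolean and modificationTime as a long in epoch milliseconds, and that the inventory paths need to be comparable with the table's own location. The WHERE filter from the original query is left out here.

```sql
-- Check 1: show the column names and types the inventory subquery produces,
-- to compare against the assumed (path STRING, length BIGINT,
-- isDir BOOLEAN, modificationTime BIGINT) shape.
DESCRIBE QUERY
SELECT
  'https://xxx.blob.core.windows.net/' || ir.Name AS path,
  ir.`Content-Length` AS length,
  CASE WHEN ir.hdi_isfolder IS NULL THEN false ELSE ir.hdi_isfolder END AS isDir,
  ir.`Last-Modified` AS modificationTime
FROM inventory_raw ir;

-- Check 2: show the table's storage location (the `location` column),
-- to compare its scheme and prefix with the `path` values built above.
DESCRIBE DETAIL schema.table;
```

If the reported types differ from the assumed ones, or the `location` value uses a different scheme or prefix than the `path` values, that mismatch would be one possible reason for zero files being matched.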
Can someone also confirm whether the USING INVENTORY clause actually works on Databricks' version of Delta? I could not find any information in the official Databricks docs, only here: https://delta.io/blog/efficient-delta-vacuum/
Thank you