Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Seeing history even after vacuuming the Delta table

Kavi_007
New Contributor III

Hi,

I'm trying to vacuum a Delta table in Unity Catalog. The default retention is 7 days. Although I vacuumed the table, I can still see history older than 7 days. I tried restarting the cluster, but it still doesn't work. What would be the fix?

Accepted Solution

Lakshay
Databricks Employee

Hi @Kavi_007, VACUUM only deletes stale files, i.e. files that are no longer tracked by the Delta log. If you perform a delete operation, the affected files become stale and are cleared once they are older than the 7-day retention period. In the Delta history of your table I do not see any delete operation, so VACUUM will not delete any files.
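
A minimal sketch of how one might check this, reusing the table name from the question. DESCRIBE HISTORY lists the operations recorded in the Delta log, and VACUUM with DRY RUN previews the files that would be deleted without removing anything. The 168-hour value is only the 7-day default written out explicitly; it is an assumption for illustration, not something stated in the thread.

```sql
-- List the operations recorded in the Delta log
-- (look for DELETE, UPDATE, MERGE, OPTIMIZE or overwrite operations)
DESCRIBE HISTORY `dev-sales-catalog`.silver.orders;

-- Preview which files VACUUM would delete, without deleting anything
VACUUM `dev-sales-catalog`.silver.orders RETAIN 168 HOURS DRY RUN;
```

If the dry run lists no files, VACUUM has nothing to remove, which would match the behaviour described in the question.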


Replies

Kavi_007
New Contributor III

Yes, below are the commands. 

VACUUM `dev-sales-catalog`.silver.orders   -- This does not vacuum anything

[Screenshot: Kavi_007_0-1712201764703.png]

SELECT * FROM `dev-sales-catalog`.silver.orders VERSION AS OF 0   -- This still returns results for a version that is 10 days old

[Screenshot: Kavi_007_1-1712201816573.png]

History of the table:

[Screenshot: Kavi_007_3-1712201997933.png]

Kavi_007
New Contributor III

@Retired_mod - Could you please check this?

Kavi_007
New Contributor III

@Lakshay - you are right. I ran a couple of DELETE statements and then ran VACUUM. It works now. Thanks for your help!

Lakshay
Databricks Employee

Happy to help!

Kavi_007
New Contributor III

No, that's wrong. VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold.

VACUUM - Azure Databricks - Databricks SQL | Microsoft Learn
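
On the original question about history, a sketch assuming the table uses default settings: the entries shown by DESCRIBE HISTORY come from the transaction log, whose retention is governed by the table property delta.logRetentionDuration (30 days by default), while the VACUUM threshold for data files is delta.deletedFileRetentionDuration (7 days by default). The interval values below only restate those defaults and are illustrative.

```sql
-- Show the properties currently set on the table
-- (properties left at their defaults may not be listed)
SHOW TBLPROPERTIES `dev-sales-catalog`.silver.orders;

-- Set both retention windows explicitly
ALTER TABLE `dev-sales-catalog`.silver.orders SET TBLPROPERTIES (
  'delta.logRetentionDuration' = 'interval 30 days',
  'delta.deletedFileRetentionDuration' = 'interval 7 days'
);
```

Log entries are cleaned up automatically when checkpoints are written, not by VACUUM, so seeing history older than 7 days after a vacuum is expected.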
