
Vacuum on DLT

JothyGanesan
New Contributor III

We currently use DLT tables as our target tables. They are loaded by pipelines running in continuous mode.

Liquid clustering is enabled on the tables. Will VACUUM work on these tables while they are being loaded in continuous mode? How can we run VACUUM without affecting the checkpoints or the ongoing load of the DLT tables?

2 ACCEPTED SOLUTIONS

emma_s
Databricks Employee

Hi, you shouldn't need to run maintenance operations such as VACUUM manually; they run automatically as part of liquid clustering, and running them manually is in fact actively discouraged.

iyashk-DB
Databricks Employee

VACUUM works fine on DLT tables running in continuous mode. DLT does automatic maintenance (OPTIMIZE + VACUUM) roughly every 24 hours if the pipeline has a maintenance cluster configured.

Q: Liquid clustering is enabled on the tables. Will VACUUM work on these tables while they are being loaded in continuous mode? How can we run VACUUM without affecting the checkpoints or the ongoing load of the DLT tables?
A: It won't interfere with checkpoints: VACUUM only removes orphaned data files and skips special directories like _delta_log; DLT manages streaming checkpoints separately under the pipeline storage .../checkpoints/. Keep retention ≥ 7 days to stay safe.
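
If you do end up running VACUUM by hand despite the automatic maintenance, a small guard around the retention window can help enforce the 7-day advice above. The following is only a minimal sketch: the helper name `build_vacuum_sql` and the table name are hypothetical, and it assumes the standard Delta Lake `VACUUM ... RETAIN n HOURS [DRY RUN]` syntax. It only builds the statement; whether a manual VACUUM is permitted on your particular DLT-managed table is something to verify in your own workspace.

```python
# Minimum retention recommended in the answer above: 7 days, expressed in hours.
MIN_SAFE_RETENTION_HOURS = 7 * 24


def build_vacuum_sql(table_name: str,
                     retain_hours: int = MIN_SAFE_RETENTION_HOURS,
                     dry_run: bool = True) -> str:
    """Build a Delta Lake VACUUM statement (hypothetical helper).

    Refuses any retention window shorter than 7 days so that files still
    needed by running streams or time travel are not removed.
    """
    if retain_hours < MIN_SAFE_RETENTION_HOURS:
        raise ValueError(
            f"retain_hours={retain_hours} is below the 7-day minimum "
            f"({MIN_SAFE_RETENTION_HOURS} hours)"
        )
    stmt = f"VACUUM {table_name} RETAIN {retain_hours} HOURS"
    if dry_run:
        # DRY RUN lists the files that would be deleted without deleting them.
        stmt += " DRY RUN"
    return stmt


# Hypothetical three-level table name, for illustration only.
preview_stmt = build_vacuum_sql("my_catalog.my_schema.my_dlt_table")
print(preview_stmt)

# In a Databricks notebook, where `spark` is predefined, the statement could
# then be executed with:
#   spark.sql(preview_stmt).show(truncate=False)
```

Starting with `dry_run=True` lets you inspect which files would be removed before committing to an actual delete.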
