Data Engineering

How does running VACUUM on Delta Lake tables affect read/write performance?

User16783853906
Contributor III

If I don't run VACUUM on a Delta Lake table, will that make my read performance slower?

2 REPLIES

sajith_appukutt
Honored Contributor II

VACUUM does not have a direct impact on read/write performance, since it only removes files that are no longer referenced by the Delta table (unless your data volume is so high that you are hitting the read limits of the underlying S3/GCS/ADLS buckets). It makes sense to run it as a separate job, scheduled daily and potentially using spot instances.
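As a rough sketch of what such a scheduled job could look like: the helper below builds the `VACUUM` SQL statement for each table and hands it to whatever function executes SQL. The helper names (`vacuum_statement`, `run_vacuum`) and the table names in the usage note are hypothetical, and the 168-hour guard is only a conservative choice based on Delta Lake's default 7-day retention threshold, not a requirement of the job.

```python
from typing import Callable, Iterable, Optional

# Delta Lake's default retention threshold is 7 days (168 hours).
DEFAULT_RETENTION_HOURS = 168


def vacuum_statement(table: str,
                     retain_hours: Optional[int] = None,
                     dry_run: bool = False) -> str:
    """Build a VACUUM statement for one Delta table (hypothetical helper)."""
    # Assumption: refuse retention windows shorter than the default, since
    # they can delete files that readers or time travel may still need.
    if retain_hours is not None and retain_hours < DEFAULT_RETENTION_HOURS:
        raise ValueError("retain_hours is below the default 168-hour threshold")
    stmt = f"VACUUM {table}"
    if retain_hours is not None:
        stmt += f" RETAIN {retain_hours} HOURS"
    if dry_run:
        # DRY RUN lists the files that would be deleted without deleting them.
        stmt += " DRY RUN"
    return stmt


def run_vacuum(run_sql: Callable[[str], object], tables: Iterable[str]) -> None:
    """Run VACUUM on each table using the supplied SQL executor."""
    for table in tables:
        run_sql(vacuum_statement(table))
```

In a Databricks notebook or job, where `spark` is already defined, this might be called as `run_vacuum(spark.sql, ["sales.orders", "sales.customers"])` and scheduled daily on its own cluster.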

User16783853906
Contributor III

VACUUM has no effect on read/write performance for that table. Even if you never run VACUUM on a Delta Lake table, reads and writes to it will not get any slower.

If you run VACUUM very infrequently, each VACUUM run may itself take a long time, so it is a good idea to run VACUUM fairly regularly. How often you should run it depends on your storage costs, since unreferenced files keep taking up storage until they are vacuumed.
