Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Optimize and Vacuum Command

Ramakrishnan83
New Contributor III

Hi team,

I run a weekly purge process from Databricks notebooks that deletes a chunk of records from my tables used for audit purposes. The tables are external tables. I need clarification on the items below.

1. Do I need to run the Optimize and Vacuum commands? Very few read queries are executed against the audit tables.

2. If I do need to run them, should I add the Optimize and Vacuum commands to the same notebook to shrink the storage layer?

3. What scenarios should I look for when deciding whether to run Optimize and Vacuum on tables involved in the purge process?

4. If I take no action, will Databricks and the Apache Spark framework take care of optimization internally?

1 REPLY

Hkesharwani
Contributor II

Hi Ramakrishnan83,
1. The Vacuum command works only with Delta tables. It deletes Parquet data files that are no longer referenced by the table and are older than the retention period, which is 7 days by default. Optimize, by contrast, compacts small files into larger ones and, if a Z-order column is specified, co-locates related data.
2. Ideally, per the Databricks recommendation, if data is written continuously, the Optimize command should be run daily.

3. Both commands, Optimize and Vacuum, improve the table, but in different ways:

  • Optimize co-locates the data based on patterns in the dataset.
  • Vacuum deletes the Parquet files from the storage layer.
Please refer to these articles for more details:
https://docs.databricks.com/en/delta/optimize.html
https://docs.databricks.com/en/sql/language-manual/delta-optimize.html
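
To make this concrete, here is a rough sketch of how the two commands could be added to the end of a purge notebook. It only builds the SQL text and hands it to `spark.sql`. The helper names (`build_maintenance_sql`, `run_maintenance`), the table name `audit.purge_log`, and the column `event_date` are made up for illustration and are not from the original question. `ZORDER BY`, `RETAIN ... HOURS`, and `DRY RUN` are optional clauses of the Delta `OPTIMIZE` and `VACUUM` statements.

```python
from typing import List, Optional, Sequence


def build_maintenance_sql(
    table: str,
    zorder_cols: Optional[Sequence[str]] = None,
    retain_hours: Optional[int] = None,
    dry_run: bool = False,
) -> List[str]:
    """Return the OPTIMIZE and VACUUM statements for one Delta table.

    Hypothetical helper: the function and its parameters are illustrative.
    """
    # OPTIMIZE compacts small files; ZORDER BY additionally co-locates
    # rows with similar values in the given columns.
    optimize = f"OPTIMIZE {table}"
    if zorder_cols:
        optimize += f" ZORDER BY ({', '.join(zorder_cols)})"

    # VACUUM removes unreferenced data files older than the retention
    # period. Without RETAIN, the table's default (7 days) applies.
    vacuum = f"VACUUM {table}"
    if retain_hours is not None:
        vacuum += f" RETAIN {retain_hours} HOURS"
    if dry_run:
        # DRY RUN only lists the files that would be deleted.
        vacuum += " DRY RUN"

    return [optimize, vacuum]


def run_maintenance(spark, table: str, **options) -> None:
    """Run the statements on an existing SparkSession (e.g. in a notebook)."""
    for statement in build_maintenance_sql(table, **options):
        spark.sql(statement)


# Example: inspect the statements before running them.
for stmt in build_maintenance_sql("audit.purge_log", zorder_cols=["event_date"], dry_run=True):
    print(stmt)
```

In a notebook, `run_maintenance(spark, "audit.purge_log")` after the weekly delete would run both commands. Note that the files removed by the purge are only physically deleted once they are older than the retention period, so the storage does not shrink immediately after the purge.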

Harshit Kesharwani
Data engineer at Rsystema
