szymon_dybczak
Esteemed Contributor III

Hi @noorbasha534 ,

Here's the general recommendation from Databricks: run OPTIMIZE on compute-optimized VMs and VACUUM on general-purpose VMs.

Comprehensive Guide to Optimize Data Workloads | Databricks

But as you said, VACUUM is a compute-intensive operation, so running it on F-series is also a good approach. They even recommend that type of compute, as shown below:

[Image: szymon_dybczak_0-1753707124150.png]

VACUUM best practices on Delta Lake - Databricks

As for ANALYZE, it collects metadata about the data, so it's primarily I/O bound. In my opinion, general-purpose compute will be a good fit here.
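
If it helps, here is a rough sketch of how you could keep that split in one place when scheduling the maintenance jobs. The node type names (Standard_F8s_v2, Standard_D8s_v3), the helper names, and the table name are only illustrative assumptions on my side, not something taken from the Databricks guides, so adjust them to your workspace. The 168-hour retention is the default Delta threshold (7 days).

```python
# Hypothetical mapping of Delta maintenance commands to Azure VM families.
# Node type names are illustrative assumptions, not official recommendations.
MAINTENANCE_PLAN = {
    "OPTIMIZE": {"family": "compute-optimized", "node_type": "Standard_F8s_v2"},
    "VACUUM": {"family": "general-purpose", "node_type": "Standard_D8s_v3"},
    "ANALYZE": {"family": "general-purpose", "node_type": "Standard_D8s_v3"},
}


def build_statement(operation: str, table: str) -> str:
    """Return the SQL statement for one maintenance operation on `table`."""
    if operation == "OPTIMIZE":
        return f"OPTIMIZE {table}"
    if operation == "VACUUM":
        # 168 hours = 7 days, the default Delta retention threshold
        return f"VACUUM {table} RETAIN 168 HOURS"
    if operation == "ANALYZE":
        return f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR ALL COLUMNS"
    raise ValueError(f"Unknown maintenance operation: {operation}")


def plan_for(table: str) -> list[tuple[str, str, str]]:
    """List (operation, node_type, sql) for every maintenance step."""
    return [
        (op, cfg["node_type"], build_statement(op, table))
        for op, cfg in MAINTENANCE_PLAN.items()
    ]


if __name__ == "__main__":
    # "main.sales.orders" is a made-up table name for the example
    for op, node_type, sql in plan_for("main.sales.orders"):
        print(f"{op:8s} on {node_type}: {sql}")
```

In a notebook or job you would then pass each generated statement to spark.sql(...) on a cluster of the matching node type. If you prefer to run VACUUM on F-series as discussed above, just change its entry in the mapping.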