E series vs F series VM's

Sainath368 — Mon, 28 Jul 2025 12:59:04 GMT

Hi all,
I need to run weekly maintenance on approximately 7,000 tables in my Databricks environment, involving OPTIMIZE, VACUUM, and ANALYZE TABLE (for statistics calculation) on all tables.

My question is: between the Ev4, Edv4, and Fsv2 VM series, which would be best suited for the driver and worker nodes in a Databricks cluster handling this workload, especially considering time constraints?

I’m looking for recommendations on the VM series that would minimize task completion times while balancing cost and resource efficiency.

Re: E series vs F series VM's

mani_22 — Mon, 28 Jul 2025 22:50:13 GMT

@Sainath368 OPTIMIZE and VACUUM are compute-intensive operations, so a compute-optimized instance such as the F series, which has a higher CPU-to-memory ratio, is a good fit for both the driver and the worker nodes.
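
If you keep running the maintenance yourself, here is a rough sketch of how the weekly loop over many tables could be driven from the driver with a thread pool, so that several tables are processed at once. The helper names (`maintenance_statements`, `run_maintenance`), the 168-hour retention, the `FOR ALL COLUMNS` clause, and the worker count are all assumptions for illustration, not values taken from your setup. `execute` is whatever runs a SQL string, for example `spark.sql` in a notebook.

```python
from concurrent.futures import ThreadPoolExecutor


def maintenance_statements(table, retain_hours=168):
    """Build the three maintenance statements for one table.

    retain_hours=168 (7 days) and FOR ALL COLUMNS are assumptions;
    adjust them to your own retention and statistics requirements.
    """
    return [
        f"OPTIMIZE {table}",
        f"VACUUM {table} RETAIN {retain_hours} HOURS",
        f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR ALL COLUMNS",
    ]


def run_maintenance(tables, execute, max_workers=8):
    """Run the statements for each table, several tables in parallel.

    Statements for a single table run in order; different tables are
    handled by different threads. Returns the table names in input order.
    """
    def maintain(table):
        for statement in maintenance_statements(table):
            execute(statement)
        return table

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(maintain, tables))
```

In a notebook this would be called as `run_maintenance(table_names, spark.sql)`, where `table_names` is your own list of fully qualified table names. How many threads are worth using depends on the cluster size, so treat `max_workers=8` as a starting point to tune.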

If these are Unity Catalog (UC) managed tables, I recommend enabling predictive optimization, which automatically runs VACUUM, OPTIMIZE, and ANALYZE on serverless compute.

Documentation: https://docs.databricks.com/aws/en/optimizations/predictive-optimization
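
As a small illustration, predictive optimization can be switched on at the catalog or schema level with an `ALTER ... PREDICTIVE OPTIMIZATION` statement. The sketch below only builds that statement as a string. The helper name `predictive_optimization_statement` and the object names `main` and `main.sales` are hypothetical, and you should check the documentation above for the prerequisites that apply to your workspace.

```python
def predictive_optimization_statement(level, name, action="ENABLE"):
    """Build an ALTER statement that sets predictive optimization.

    level  -- "CATALOG" or "SCHEMA"
    name   -- catalog or schema name (hypothetical names in the examples)
    action -- "ENABLE", "DISABLE", or "INHERIT"
    """
    level = level.upper()
    action = action.upper()
    if level not in {"CATALOG", "SCHEMA"}:
        raise ValueError(f"unsupported level: {level}")
    if action not in {"ENABLE", "DISABLE", "INHERIT"}:
        raise ValueError(f"unsupported action: {action}")
    return f"ALTER {level} {name} {action} PREDICTIVE OPTIMIZATION"


# Example with hypothetical names; pass the result to spark.sql(...) to apply it.
print(predictive_optimization_statement("catalog", "main"))
print(predictive_optimization_statement("schema", "main.sales"))
```

Enabling it at the catalog level means schemas and managed tables underneath can inherit the setting, so you would not need to touch each of the 7,000 tables individually.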
