Hi,
Running on Databricks on AWS, I have a job that runs on an autoscaling cluster with up to 3 workers, 2 cores each (r6i.large).
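For reference, the cluster is configured roughly like this (a minimal sketch of the relevant part of the cluster spec in Clusters API form; the min_workers value and the spark_version are assumptions for illustration, not copied from my actual config):

```python
# Rough sketch of the autoscaling-related part of the cluster spec.
# min_workers=1 and spark_version are illustrative assumptions.
cluster_spec = {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "r6i.large",   # 2 cores, 16 GB RAM per worker
    "autoscale": {
        "min_workers": 1,
        "max_workers": 3,
    },
}
```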
The Spark job has two stages (a rough sketch follows the list):
(1) A highly parallelizable, CPU-intensive stage. This stage takes 15 minutes.
(2) A non-parallelizable stage (only a single partition, so a single Spark task). This stage takes 45 minutes.
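The job has roughly this shape (a minimal PySpark sketch, not my actual code; the synthetic data, the function name, and the repartition calls are illustrative assumptions used to force the two stage shapes described above):

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.getOrCreate()

# Stage 1: highly parallelizable, CPU-intensive work spread across many tasks.
# `expensive_fn` stands in for the real per-row computation.
def expensive_fn(x):
    return float(x) ** 0.5

expensive_udf = F.udf(expensive_fn, DoubleType())

wide = (
    spark.range(0, 10_000_000)
         .repartition(64)                           # many tasks -> all 6 cores stay busy
         .withColumn("result", expensive_udf("id"))
)

# Stage 2: non-parallelizable step, forced onto a single partition,
# so it runs as one task on a single core of a single worker.
narrow = wide.repartition(1)
narrow.write.mode("overwrite").parquet("/tmp/output")   # placeholder output path
```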
In the first stage, the cluster scales up from 1 worker to 3, and all 3 workers (6 cores) are fully utilized for the duration of the stage (15 minutes). Then, in the second stage, only a single worker node is active for the entire 45 minutes, but Databricks does not scale the cluster down, so I have two worker nodes sitting completely idle for 45 minutes.
Any idea why that is, and how I can use autoscaling to make this type of job more cost-efficient?
Thanks!