Why doesn't my job-cluster scale down?

804925
New Contributor II

Hi,

Running on Databricks on AWS, I have a job running on a cluster with up to 3 workers (r6i.large, 2 cores each) and autoscaling enabled.

The Spark job has two stages:

(1) a highly parallelizable, CPU-intensive stage. This stage takes 15 minutes.

(2) a non-parallelizable stage (only a single partition, so a single Spark task). This stage takes 45 minutes.

In the first stage, the cluster scales up from 1 worker to 3, and all 3 workers (6 cores) are fully utilized for the duration of the stage (15 minutes). Then, in the second stage, only a single worker node is active for the entire 45 minutes, but Databricks does not scale my cluster down, so I have two nodes sitting completely idle for 45 minutes.

Any idea why that is, and how I can use autoscaling to be more cost-efficient for this type of job?

Thanks!

1 REPLY

Anonymous
Not applicable

Hi @Yoav Ben,

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question. Thanks.