Why doesn't my job cluster scale down?

804925
New Contributor II

Hi,

Running on Databricks on AWS, I have a job running on a cluster of 3 workers with 2 cores each (r6i.large), with autoscaling enabled.
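For reference, the job cluster spec looks roughly like this (a sketch of the new_cluster block passed to the Jobs API; only the node type and autoscale bounds reflect my setup, and the runtime version is a placeholder):

```python
# Sketch of the job cluster spec (Databricks Jobs/Clusters API). Only
# node_type_id and the autoscale bounds reflect the setup described
# above; spark_version is a placeholder.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",  # placeholder runtime
    "node_type_id": "r6i.large",          # 2 cores per worker
    "autoscale": {
        "min_workers": 1,  # cluster starts at a single worker
        "max_workers": 3,  # and can scale up to 3 workers
    },
}
```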

The Spark job has two stages:

(1) a highly parallelizable, CPU-intensive stage, which takes 15 minutes.

(2) a non-parallelizable stage (only a single partition, so a single Spark task), which takes 45 minutes. (A minimal sketch of this two-stage shape follows below.)
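Here is a minimal sketch of the job's shape. It is not my actual code; the transformation, partition counts, and output path are placeholders:

```python
# Minimal sketch of the two-stage job described above. The transformation,
# partition counts, and output path are placeholders, not the real job.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Stage 1: CPU-intensive work spread across many partitions, so every
# core on every worker stays busy (~15 minutes in the real job).
df = (
    spark.range(0, 100_000_000, numPartitions=48)
    .withColumn("digest", F.sha2(F.col("id").cast("string"), 256))
)

# Stage 2: collapsing to a single partition forces the remaining work to
# run as one task on one core (~45 minutes in the real job), no matter
# how many workers the cluster has.
(
    df.repartition(1)
    .write.mode("overwrite")
    .parquet("/tmp/placeholder_output")  # placeholder path
)
```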

In the first stage, the cluster scales up from 1 worker to 3, and all 3 workers (6 cores) are fully utilized for the duration of the stage (15 minutes). Then, in the second stage, only a single worker node is active for the entire 45 minutes, but Databricks does not scale down my cluster, leaving two nodes completely idle for those 45 minutes.

Any idea why that is, and how I can use autoscaling to be more cost-efficient with this type of job?

Thanks!

1 REPLY

Anonymous
Not applicable

Hi @Yoav Ben,

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question. Thanks.

Welcome to Databricks Community: Let's learn, network, and celebrate together

Join our fast-growing community of 80K+ data practitioners and experts, ready to discover, help, and collaborate while making meaningful connections.


Engage in exciting technical discussions, join a group with your peers, and meet our Featured Members.