Data Engineering

Databricks Cluster Autoscaling

Mr__D
New Contributor II

Hello All,

Could anyone please explain the impact of autoscaling on cluster cost?

Suppose I have a cluster where min workers is 2 and max workers is 10, but most of the time only 3 workers are active. Will the cluster be billed for only 3 workers, or for 10 workers (even though 7 workers are idle most of the time)?

Thanks,

Deepak

1 REPLY

Anonymous
Not applicable

@Deepak Bhatt:

Autoscaling in Databricks can have a significant impact on cluster cost, as it allows the cluster to dynamically add or remove workers based on the workload.

In the scenario you described, if the active worker count stays at 3, the cluster is billed for 3 workers during that time, regardless of the maximum number of workers set for the cluster. Billing follows the workers that are actually running, not the configured maximum. However, if there are occasional spikes in workload that require additional workers, the cluster may temporarily scale up to meet the demand and incur additional costs during those periods.
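As a rough illustration of that arithmetic, here is a minimal sketch that compares an autoscaled day with a cluster pinned at the maximum. The usage pattern (3 workers for 22 hours, 10 workers for 2 hours) and the flat per-worker hourly rate are made-up assumptions, not real Databricks prices, and the driver node is left out for simplicity.

```python
def estimate_worker_cost(usage, hourly_rate_per_worker):
    """Estimate worker cost from (active_workers, hours) pairs.

    The rate is a hypothetical all-in price per worker-hour.
    """
    worker_hours = sum(workers * hours for workers, hours in usage)
    return worker_hours * hourly_rate_per_worker


# Hypothetical flat rate per worker-hour (not a real price).
RATE = 0.50

# Hypothetical day: mostly 3 workers, with a 2-hour spike to the max of 10.
autoscaled_usage = [(3, 22), (10, 2)]
autoscaled = estimate_worker_cost(autoscaled_usage, RATE)

# Same day if the cluster had been fixed at 10 workers.
fixed_at_max = estimate_worker_cost([(10, 24)], RATE)

print(f"autoscaled:   {autoscaled:.2f}")
print(f"fixed at max: {fixed_at_max:.2f}")
```

Under these assumed numbers the autoscaled day comes to 86 worker-hours against 240 for the fixed cluster, which is the difference the question is asking about.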

The cost of autoscaling depends on the instance type and the duration of the scaling events. For example, adding or removing a worker may take a few minutes to complete, and during that time, the cluster will incur costs for the additional worker(s) even if they are not fully utilized. Additionally, larger instance types will have higher hourly costs than smaller ones, so using autoscaling with larger instances may result in higher costs.

To minimize costs with autoscaling, it's important to monitor cluster usage and adjust the minimum and maximum worker counts based on the workload patterns. Setting the minimum workers to the average number of workers needed during low periods can help to reduce costs, while still allowing for scaling up during high-demand periods. Similarly, setting the maximum workers to a reasonable level that meets the needs of the workload can help to avoid unnecessary costs associated with scaling up beyond what is necessary.
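One way to turn that advice into numbers is sketched below, assuming you have already collected periodic samples of the active worker count from your own monitoring. The function name `suggest_bounds`, the sample data, and the choice of a 95th-percentile cap for the maximum are illustrative assumptions, not a Databricks feature or recommendation.

```python
import math
from statistics import mean


def suggest_bounds(samples, peak_quantile=0.95):
    """Suggest (min_workers, max_workers) from observed active-worker counts.

    min_workers: average of the quieter half of the samples (the "low periods").
    max_workers: nearest-rank quantile of all samples (an assumed heuristic).
    """
    ordered = sorted(samples)
    low_half = ordered[: max(1, len(ordered) // 2)]
    min_workers = max(1, round(mean(low_half)))
    rank = max(1, math.ceil(peak_quantile * len(ordered)))
    max_workers = ordered[rank - 1]
    # Never return a maximum below the minimum.
    return min_workers, max(max_workers, min_workers)


# Hypothetical samples of active workers taken at regular intervals.
samples = [3, 3, 3, 2, 3, 4, 3, 3, 10, 3, 3, 8]
min_workers, max_workers = suggest_bounds(samples)
print(min_workers, max_workers)
```

Lowering `peak_quantile` trades a smaller maximum (lower worst-case cost) against slower jobs during the rare spikes, so it is worth checking against your own workload before changing the cluster settings.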
