Databricks Community

User16826992666 · ‎06-15-2021

What determines when the cluster autoscaling activates to add and remove workers? Also, can it be adjusted?

sajith_appukutt · ‎06-17-2021

> What determines when the cluster autoscaling activates to add and remove workers

During scale-down, the service removes a worker only if it is idle and does not contain any shuffle data. This allows aggressive resizing without killing tasks or recomputing intermediate results. It also scales the cluster up aggressively in response to demand, which keeps responsiveness high without sacrificing efficiency. More details at https://databricks.com/blog/2018/05/02/introducing-databricks-optimized-auto-scaling.html
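To make the scale-down rule concrete, here is a minimal sketch of the check described above. It is not Databricks code: the `Worker` class, its fields, and the `removable_workers` helper are hypothetical names made up for illustration, and the `min_workers` floor is assumed to correspond to the cluster's configured minimum worker count.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    """Hypothetical view of a worker; not a Databricks type."""
    name: str
    running_tasks: int   # tasks currently executing on this worker
    shuffle_bytes: int   # shuffle data still stored on this worker


def removable_workers(workers: List[Worker], min_workers: int) -> List[Worker]:
    """Pick workers that are idle AND hold no shuffle data,
    without dropping the cluster below min_workers (assumed floor)."""
    candidates = [
        w for w in workers
        if w.running_tasks == 0 and w.shuffle_bytes == 0
    ]
    max_removals = max(len(workers) - min_workers, 0)
    return candidates[:max_removals]


workers = [
    Worker("w1", running_tasks=3, shuffle_bytes=0),    # busy: kept
    Worker("w2", running_tasks=0, shuffle_bytes=512),  # idle but has shuffle data: kept
    Worker("w3", running_tasks=0, shuffle_bytes=0),    # idle and empty: removable
    Worker("w4", running_tasks=0, shuffle_bytes=0),    # idle and empty: removable
]
names = [w.name for w in removable_workers(workers, min_workers=3)]
print(names)
```

Note that `w2` is kept even though it is idle, because removing it would force the shuffle data it holds to be recomputed.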

> Also, can it be adjusted?

Databricks offers two types of cluster node autoscaling: standard and optimized. Depending on the type, the parameters you could tune are:

spark.databricks.aggressiveWindowDownS
spark.databricks.autoscaling.standardFirstStepUp
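As a rough illustration of where such a setting would go, the sketch below builds the autoscaling and Spark config portion of a cluster definition as a plain Python dict. The field names `autoscale`, `min_workers`, `max_workers`, and `spark_conf` are assumed to match the Databricks Clusters API; the `build_cluster_fragment` helper and all values are made up for the example. My understanding is that `spark.databricks.aggressiveWindowDownS` is a value in seconds that controls how often down-scaling decisions are made, but please verify that against the current documentation. `spark.databricks.autoscaling.standardFirstStepUp` is left out because its accepted values are not stated in this thread.

```python
import json


def build_cluster_fragment(min_workers: int, max_workers: int, down_window_s: int) -> dict:
    """Return only the autoscaling-related part of a cluster definition.
    Other fields a real cluster needs (Spark version, node type, ...) are omitted."""
    if not 0 <= min_workers <= max_workers:
        raise ValueError("need 0 <= min_workers <= max_workers")
    if down_window_s <= 0:
        raise ValueError("down_window_s must be a positive number of seconds")
    return {
        "autoscale": {
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
        "spark_conf": {
            # Assumption: value is in seconds; Spark conf values are passed as strings.
            "spark.databricks.aggressiveWindowDownS": str(down_window_s),
        },
    }


fragment = build_cluster_fragment(min_workers=2, max_workers=8, down_window_s=300)
print(json.dumps(fragment, indent=2))
```

The same key/value pair could also be entered in the Spark config box of the cluster's advanced options instead of going through the API.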



How does cluster autoscaling work?
