Hi @aranjan99,
There are two distinct autoscaling features in Databricks, and it helps to clarify the difference since the naming can be confusing.
OPTIMIZED AUTOSCALING (for job and all-purpose clusters)
If your workspace is on the Premium plan (or above), your clusters already use "optimized autoscaling" automatically. This is a more advanced scaling algorithm than the one used on the Standard plan, and it includes:
- Two-phase scale-up for faster resource allocation
- Mid-job scale-down by analyzing shuffle file state
- Faster downscale timing: job clusters scale down after 40 seconds of underutilization (vs. 150 seconds for all-purpose clusters)
- Percentage-based downscaling from the current node count
You do not need to enable this separately. If you are on a Premium workspace, it is active by default for all cluster types (job and all-purpose).
You can also tune the downscaling frequency with this Spark config property:
spark.databricks.aggressiveWindowDownS = <seconds>
This controls how often the cluster evaluates scale-down decisions. The maximum value is 600 seconds. Increasing it causes the cluster to scale down more slowly, which can be useful for bursty workloads.
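You set this like any other Spark config in the cluster spec's spark_conf map (or under Advanced options > Spark in the UI). A minimal sketch, where the value 120 is just an illustration, not a recommendation:

```json
{
  "spark_conf": {
    "spark.databricks.aggressiveWindowDownS": "120"
  }
}
```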
To configure autoscaling on a job cluster, set the min and max workers in your job's cluster spec:
{
  "autoscale": {
    "min_workers": 2,
    "max_workers": 10
  }
}
Or in the Jobs UI, when configuring your job cluster, check "Enable autoscaling" and set the min/max worker range.
ENHANCED AUTOSCALING (pipeline-only)
"Enhanced autoscaling" is a separate, more advanced feature that is only available within Lakeflow Spark Declarative Pipelines (SDP, previously known as Delta Live Tables). It uses task-slot utilization and task-queue depth to make smarter scaling decisions, and can proactively shut down underutilized nodes without causing task failures.
Enhanced autoscaling is enabled by default for new SDP pipelines and is not available for regular job clusters or all-purpose clusters.
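If you want to set it explicitly in the pipeline settings JSON, the autoscale mode is configured per pipeline cluster. A sketch, assuming the default cluster label and illustrative worker counts:

```json
{
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5,
        "mode": "ENHANCED"
      }
    }
  ]
}
```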
WHAT TO DO FOR JOB CLUSTERS
If you are looking for the best autoscaling behavior on job clusters:
1. Confirm your workspace is on the Premium plan to get optimized autoscaling.
2. Set appropriate min/max worker ranges for your workload.
3. Use the spark.databricks.aggressiveWindowDownS config to tune scale-down behavior if needed.
4. If your workload is streaming and you need enhanced autoscaling specifically, consider migrating to a Lakeflow Spark Declarative Pipeline, which supports enhanced autoscaling natively.
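Steps 2 and 3 above can be combined in a single job cluster spec. A sketch; the spark_version, node_type_id, and the 120-second window here are placeholder values you would replace for your own environment:

```json
{
  "new_cluster": {
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "autoscale": {
      "min_workers": 2,
      "max_workers": 10
    },
    "spark_conf": {
      "spark.databricks.aggressiveWindowDownS": "120"
    }
  }
}
```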
For more details, see the autoscaling section of the compute configuration docs:
https://docs.databricks.com/en/compute/configure.html
And for enhanced autoscaling in pipelines:
https://docs.databricks.com/en/delta-live-tables/auto-scaling.html
* This reply was drafted by an agent system I built, which researches responses using the documentation I have available and prior memory. I personally review each draft for obvious issues and to monitor the system's reliability, and I correct it when I detect drift, but there is still a small chance something is inaccurate, especially if you are experimenting with brand-new features.
If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.