Re: Autoscaling with the autoloader without SDP

HTD360 · 05-11-2026

Hi there,

I have a question about using Auto Loader without SDP (Lakeflow Spark Declarative Pipelines) together with cluster autoscaling. I'm reading the following in the docs:

- Production considerations for Structured Streaming | Databricks on AWS: "Do not enable autoscaling for compute for Structured Streaming jobs."
- Configure Auto Loader for production workloads | Databricks on AWS: "Enhanced autoscaling implements optimization of streaming workloads and adds enhancements to improve the performance of batch workloads. Enhanced autoscaling optimizes costs by adding or removing machines as the workload changes."
- But also: "Compute auto-scaling has limitations when scaling down cluster size for structured streaming workloads. Databricks recommends using Lakeflow Spark Declarative Pipelines with enhanced autoscaling for streaming workloads."

We don't use SDP because of serverless limitations. Is it not advised to use enhanced autoscaling for non-SDP jobs? And why is that?

HTD360 · 05-11-2026

To add to the question: what if I have a job with 10 tasks that all use Auto Loader? Would that benefit from autoscaling?

amirabedhiafi · 05-12-2026

Hello!

Databricks has two different autoscaling mechanisms that matter here:

- Standard compute autoscaling on a job or all-purpose cluster: this is the autoscaling you enable on a classic job cluster by setting a minimum and maximum number of workers. It is not recommended for Structured Streaming jobs because scale-down has limitations for streaming workloads. The cluster may not scale down as expected, and resizing can add latency, especially for stateful streams.

- Enhanced autoscaling for Lakeflow Spark Declarative Pipelines (SDP): this is a pipeline-specific autoscaling mode that uses pipeline workload metrics such as task slot utilization and queued tasks. It is optimized for streaming workloads and can proactively shut down underutilized nodes while avoiding failed tasks during shutdown.
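
To make the first mechanism concrete, here is a minimal sketch of how the two sizing options differ in a classic job cluster definition. It assumes the `new_cluster` fields of the Databricks Clusters API (`num_workers` for a fixed size, `autoscale` with `min_workers` and `max_workers` for compute autoscaling). The helper name `job_cluster_spec` and the placeholder runtime and node type values are hypothetical, so replace them with your own.

```python
import json


def job_cluster_spec(num_workers=None, min_workers=None, max_workers=None):
    """Build a `new_cluster` block for a classic job cluster (hypothetical helper).

    Pass `num_workers` for a fixed-size cluster, or `min_workers` and
    `max_workers` to enable standard compute autoscaling.
    """
    spec = {
        "spark_version": "<your-runtime-version>",  # placeholder, not a real value
        "node_type_id": "<your-node-type>",         # placeholder, not a real value
    }
    if num_workers is not None:
        if min_workers is not None or max_workers is not None:
            raise ValueError("Use either num_workers or min/max workers, not both")
        # Fixed size: the cluster never resizes while the stream runs.
        spec["num_workers"] = num_workers
    else:
        if min_workers is None or max_workers is None or min_workers > max_workers:
            raise ValueError("Autoscaling needs min_workers <= max_workers")
        # Standard compute autoscaling: workers vary between the two bounds.
        spec["autoscale"] = {"min_workers": min_workers, "max_workers": max_workers}
    return spec


fixed = job_cluster_spec(num_workers=4)
autoscaled = job_cluster_spec(min_workers=2, max_workers=8)

print(json.dumps(fixed, indent=2))
print(json.dumps(autoscaled, indent=2))
```

The only difference between the two specs is `num_workers` versus the `autoscale` block, which is why it is easy to switch a streaming job to fixed-size compute without touching the streaming code itself.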

In short:

- For continuous non-SDP Auto Loader jobs, use fixed-size job compute.
- For non-SDP Auto Loader jobs that run with the available-now trigger, compute autoscaling can be reasonable.
- For streaming autoscaling with better scale-down behavior, SDP with enhanced autoscaling is the recommended option.
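
The difference between a "continuous" and an "available now" Auto Loader job comes down to the trigger on the write side. Below is a minimal sketch, assuming a JSON source and a Delta target table. The function name `start_auto_loader`, its parameters, and the one-minute processing interval are hypothetical choices for illustration. The `cloudFiles` options and the `trigger(availableNow=True)` call are standard PySpark and Auto Loader usage.

```python
def start_auto_loader(spark, source_path, target_table, checkpoint_path,
                      available_now=True):
    """Start an Auto Loader stream (hypothetical helper, not an official API).

    available_now=True  -> process all files available at start, then stop.
    available_now=False -> keep running and poll for new files every minute.
    """
    reader = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")  # assumed source format
        .option("cloudFiles.schemaLocation", checkpoint_path)
    )
    writer = (
        reader.load(source_path)
        .writeStream.option("checkpointLocation", checkpoint_path)
    )
    if available_now:
        # Incremental batch style: the query ends once the backlog is processed.
        writer = writer.trigger(availableNow=True)
    else:
        # Long-running stream: the query stays active between micro-batches.
        writer = writer.trigger(processingTime="1 minute")
    return writer.toTable(target_table)


# Usage on a Databricks cluster, where `spark` is already defined
# (paths and table name are placeholders):
# query = start_auto_loader(spark, "<source-path>", "<catalog.schema.table>",
#                           "<checkpoint-path>", available_now=True)
# query.awaitTermination()
```

With `available_now=True` the query terminates on its own after the backlog is processed, so the job behaves like a scheduled batch run. With `available_now=False` the query keeps running, which is the case where fixed-size compute is suggested above.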



HTD360 · 05-12-2026

Hi, thank you for your answer. Could you elaborate a bit on this point?
"for non SDP available now auto loader jobs autoscaling can be reasonable"

How do you decide whether it is reasonable or not, especially since you said that compute autoscaling is not recommended because scale-down has limitations for streaming workloads?