Hello!
In Databricks there are two different autoscaling mechanisms involved here:
- Standard compute autoscaling on a job or all-purpose cluster: this is the autoscaling you enable on a classic job cluster by setting min and max workers. It is not recommended for Structured Streaming jobs, because scale-down has limitations for streaming workloads. The cluster may not scale down as expected, and resizing adds latency, especially for stateful streams.
- Autoscaling for LSDP (Lakeflow Spark Declarative Pipelines): this is a pipeline-specific autoscaling mode that uses pipeline workload metrics such as task slot usage and queued tasks. It is better optimized for streaming workloads and can proactively shut down underutilized nodes while avoiding failed tasks during shutdown.
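To make the difference concrete, here is a rough sketch of how the two mechanisms show up in settings, written as Python dicts that mirror the JSON you would send to the Jobs and Pipelines APIs. The runtime version, node type, and worker counts are placeholders I made up for illustration, so adjust them to your workspace.

```python
# Illustrative settings fragments only. The runtime version, node type, and
# worker counts are placeholder assumptions, not recommendations.

# Classic job cluster with standard compute autoscaling (min/max workers).
autoscaling_job_cluster = {
    "spark_version": "<runtime-version>",
    "node_type_id": "<node-type>",
    "autoscale": {"min_workers": 2, "max_workers": 8},
}

# Classic job cluster with a fixed size: no "autoscale" block, just num_workers.
fixed_job_cluster = {
    "spark_version": "<runtime-version>",
    "node_type_id": "<node-type>",
    "num_workers": 4,
}

# Pipeline cluster using the pipeline-specific autoscaling mode.
pipeline_cluster = {
    "label": "default",
    "autoscale": {"min_workers": 1, "max_workers": 5, "mode": "ENHANCED"},
}
```

The first two dicts would go under `new_cluster` in a job definition, and the third under `clusters` in the pipeline settings.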
In short:
- For non-SDP continuous Auto Loader jobs, use fixed-size job compute.
- For non-SDP "available now" Auto Loader jobs, compute autoscaling can be reasonable.
- For streaming with autoscaling and better scale-down behavior, LSDP autoscaling is the recommended option.
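If it helps, the three rules above can be written down as a tiny decision helper. This is only a sketch: `recommend_compute` and its argument values are hypothetical names of mine, not a Databricks API.

```python
def recommend_compute(uses_pipeline: bool, trigger: str) -> str:
    """Hypothetical helper that encodes the three rules of thumb above.

    trigger is "continuous" or "available_now" (labels assumed for this sketch).
    """
    if uses_pipeline:
        # Pipelines get the pipeline-specific autoscaling mode.
        return "pipeline autoscaling"
    if trigger == "continuous":
        # Always-on stream on classic compute: avoid scale-down problems.
        return "fixed-size job compute"
    if trigger == "available_now":
        # Batch-style run that processes the backlog and stops.
        return "job compute with autoscaling"
    raise ValueError(f"unknown trigger: {trigger!r}")


print(recommend_compute(uses_pipeline=False, trigger="continuous"))
```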
If this answer resolves your question, could you please mark it as "Accept as Solution"? It will help other users quickly find the correct fix.
Senior BI/Data Engineer | Microsoft MVP Data Platform | Microsoft MVP Power BI | Power BI Super User | C# Corner MVP