Databricks Community

Karthik_2 · ‎01-04-2025

Hi,

We are planning to migrate the backend of our web application, currently hosted on App Service with an Azure SQL Database, to Azure Databricks as the data source. For this, we intend to use the SQL Warehouse in Databricks to execute queries and interact with our gold table in Unity Catalog.

Our anticipated peak load is 250 concurrent users. However, I couldn’t find any official documentation confirming whether the concurrency limits of SQL Warehouse are based on cluster size or a fixed number. In the official documentation, I noticed that Databricks recommends 10 concurrent queries per cluster, regardless of its size.

Could you please advise on this? Your guidance would be greatly appreciated.

Thank you!

Walter_C · ‎01-05-2025

Hello Karthik, many thanks for your question. Databricks SQL Warehouses use dynamic concurrency to handle varying demands. Unlike static-capacity warehouses, Databricks SQL adjusts compute resources in real time to manage concurrent loads and maximize throughput. Each warehouse size category has a fixed compute capacity per unit, but the system scales the number of resources to accommodate varying demands.

For any warehouse type, you choose a cluster size for its compute resources. Databricks recommends a cluster for every 10 concurrent queries. The maximum number of queries in a queue for all SQL warehouse types is 1000. The upscaling of clusters per warehouse is based on query throughput, the rate of incoming queries, and the queue size.

Here is a summary of the scaling behavior:

If the expected query load is less than 2 minutes, Databricks does not upscale.
If the expected query load is between 2 to 6 minutes, Databricks adds 1 cluster.
If the expected query load is between 6 to 12 minutes, Databricks adds 2 clusters.
If the expected query load is between 12 to 22 minutes, Databricks adds 3 clusters.
For every additional 15 minutes of expected query load beyond 22 minutes, Databricks adds 1 more cluster.

Additionally, a warehouse is always upscaled if a query waits for 5 minutes in the queue. If the load is low for 15 minutes, Databricks downscales the SQL warehouse, keeping enough clusters to handle the peak load over the last 15 minutes.

For serverless SQL warehouses, intelligent workload management (IWM) dynamically manages workloads by using machine learning models to predict the resource demands of incoming queries while monitoring the warehouse’s available compute capacity in real time. This allows for rapid upscaling to maintain low latency and quick downscaling to minimize costs when demand is low.

You can refer to: https://docs.databricks.com/en/compute/sql-warehouse/warehouse-behavior.html