Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Job compute is taking longer even after using pool

bhargavabasava
New Contributor II

Hi team,

We created a workflow and attached it to a job cluster (configured to use a compute pool). When we run the pipeline, it takes up to 5 minutes for the cluster to reach the clusterReady state, which adds latency to our use case. Even on subsequent runs, it waits for the cluster to be ready. Can someone please help me understand how to reduce the overall latency and the best way to use job compute?

We also tried serverless compute (non-SQL), and it adds around 20-25 seconds of latency to each task in the job. In the screenshot (Screenshot 2025-04-03 at 3:43:23 PM), the task took 33 seconds but the notebook cell ran for only 16 seconds. I would like to understand what is adding to the latency in this case.

Thanks & Regards,

Bhargava

3 REPLIES

Isi
Contributor III

Hey @bhargavabasava ,

Job Cluster + Compute Pools: Long Startup Times

If you’re using Job Clusters backed by compute pools, the initial delay (~5 minutes) is usually due to cluster provisioning. While compute pools are designed to reduce cold start times by pre-warming VMs, startup latency can still occur if:

  • There are no idle VMs available in the pool (e.g., 0 instances in the idle state); see the pool sketch after this list.

  • The cluster needs to install libraries or run init scripts, which adds to the boot time.
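
On the first point, here is a minimal sketch of creating a pool with a couple of warm instances using the Databricks Python SDK (databricks-sdk). The pool name, node type, and sizes below are placeholders rather than values from your setup; keeping min_idle_instances above 0 is what lets job clusters skip most of the provisioning wait.

```python
# Hedged sketch: create an instance pool that keeps a few VMs warm between job runs.
# Assumes databricks-sdk is installed and workspace authentication is already configured.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

pool = w.instance_pools.create(
    instance_pool_name="jobs-warm-pool",        # placeholder name
    node_type_id="Standard_DS3_v2",             # placeholder node type (Azure example)
    min_idle_instances=2,                       # keep 2 warm VMs so job clusters skip provisioning
    max_capacity=10,
    idle_instance_autotermination_minutes=10,   # release extra idle VMs after ~10 minutes
)
print(pool.instance_pool_id)  # reference this id from the job cluster spec
```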

Serverless Jobs Latency (~20–25 seconds overhead)

The behavior you’re seeing, where the notebook logic runs for 16 seconds but the task duration is 33 seconds, is expected when using serverless compute for Jobs (non-SQL). There is a small but consistent overhead from orchestration, environment setup, and logging.

That said, serverless jobs generally start much faster than job clusters and offer more predictable latency, so a 20–25 second overhead is considered normal.

Suggestions to Reduce Latency

  • Use instance pools with Idle Instance Auto Termination set to ~10 minutes. This allows reusing VMs across runs without incurring full provisioning times.

  • If you’re using isolated job clusters, chain multiple tasks into a single job using dependencies so that only the first task pays the cold-start penalty and the subsequent tasks run on the same cluster (see the job sketch below).
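
To make the second point concrete, here is a hedged sketch (again using the Databricks Python SDK, with placeholder notebook paths, Spark version, and pool id) of a single job whose tasks all reference one shared job cluster drawn from the pool above. Only the first task waits for the cluster to start; the dependent task reuses it.

```python
# Hedged sketch: one job, one shared job cluster backed by an instance pool, chained tasks.
# Notebook paths, the Spark version, and the pool id are placeholders for illustration.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute, jobs

w = WorkspaceClient()

job = w.jobs.create(
    name="chained-tasks-shared-cluster",
    job_clusters=[
        jobs.JobCluster(
            job_cluster_key="shared",
            new_cluster=compute.ClusterSpec(
                spark_version="15.4.x-scala2.12",        # placeholder DBR version
                num_workers=2,
                instance_pool_id="<pool-id-from-above>", # draw workers from the warm pool
            ),
        )
    ],
    tasks=[
        jobs.Task(
            task_key="ingest",
            job_cluster_key="shared",
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/pipeline/ingest"),
        ),
        jobs.Task(
            task_key="transform",
            depends_on=[jobs.TaskDependency(task_key="ingest")],
            job_cluster_key="shared",                    # same cluster, no extra startup wait
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/pipeline/transform"),
        ),
    ],
)
print(job.job_id)
```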


Hope this helps 🙂

Isi

bhargavabasava
New Contributor II

Hey @Isi ,

Yeah this helps. Thanks a lot.

Isi
Contributor III

Hey @bhargavabasava ,

Happy to hear that! Consider marking my answer as the solution so it can help future users 🙂

Thanks,
Isi
