
Databricks Execution Context scheduling

tariq — Tue, 26 Aug 2025 04:58:10 GMT

Databricks allows a cluster to have a maximum of 150 execution contexts. If several execution contexts are attached and running operations on a Databricks cluster with a driver and n workers, how is the execution scheduled? I am assuming that only the driver cores can actually run the main execution context, with any Spark job being offloaded to the workers. How does the scheduling for different execution contexts work? https://kb.databricks.com/notebooks/too-many-execution-contexts-are-open-right-now

Re: Databricks Execution Context scheduling

-werners- — Tue, 26 Aug 2025 06:42:52 GMT

I am not sure what exactly your question is.
The limit of 150 execution contexts is per cluster. If you schedule that many Spark programs, you can easily avoid this limit by using job clusters.

Can you explain a bit more what you want to know?
For example, the internals of the Databricks resource scheduler when two Spark programs run at the same time, or how Spark distributes work, or ...?

Re: Databricks Execution Context scheduling

tariq — Tue, 26 Aug 2025 06:52:33 GMT

My question is: if the number of execution contexts is greater than the number of cores available in the driver of the cluster, how does Databricks schedule the execution, since one execution context will require at least one core to run on? Is it round-robin or something else? To keep things simple, let's assume no Spark code is being executed, so there won't be any offloading to workers.

Re: Databricks Execution Context scheduling

-werners- — Tue, 26 Aug 2025 07:35:30 GMT

OK, I get it.
An execution context is not bound to a CPU. It is like a session. So basically the limit of 150 execution contexts means that 150 sessions/Spark programs can run simultaneously on the cluster (whether that is possible on the hardware is another question).
Knowing that, your question is in fact:
if the number of Spark tasks is more than the cores available in the driver...
First: the driver only does orchestration; it does not run Spark tasks (it does run native Python code, though, and it receives the results when you call collect() etc.). The workers execute the tasks.
If there are more tasks than there are CPU cores available across all workers, then either extra nodes are created (if you use autoscaling), or the tasks wait until resources become available.
This happens a lot actually, especially on tables with lots of partitions (which often exceed the number of cores).
Spark can handle this without any issue.
Timeouts can occur, however.
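
To make the "tasks wait until resources become available" part concrete, here is a small sketch in plain Python. It is not Spark and not how Databricks implements anything; it is only an analogy using a thread pool. The names run_tasks, expected_waves and num_slots are made up for illustration, and it assumes one task slot per core (which matches Spark's default of one CPU per task).

```python
import math
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_tasks(num_tasks, num_slots, task_seconds=0.05):
    """Run num_tasks on a pool with num_slots workers.

    Returns the task results and the highest number of tasks
    that were ever running at the same moment.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def task(i):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(task_seconds)  # stand-in for real work
        with lock:
            running -= 1
        return i

    # Tasks beyond num_slots are queued by the pool until a slot frees up.
    with ThreadPoolExecutor(max_workers=num_slots) as pool:
        results = list(pool.map(task, range(num_tasks)))
    return results, peak


def expected_waves(num_tasks, num_slots):
    """Rough number of 'waves' if all tasks take about the same time."""
    return math.ceil(num_tasks / num_slots)


if __name__ == "__main__":
    results, peak = run_tasks(num_tasks=10, num_slots=4)
    print("completed:", len(results), "peak concurrency:", peak)
    print("approx. waves:", expected_waves(10, 4))
```

All 10 tasks complete even though only 4 can run at once; the rest simply wait in the queue. By the same rough arithmetic, a stage with 200 partitions on 16 worker cores would run in about 13 waves, assuming the tasks are of similar length.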