One worker is one executor and one physical node

MikeGo
Valued Contributor

Hi team,

It seems that in Databricks, unlike running Spark jobs on a k8s cluster, when a workflow runs on a Job Compute/Cluster or an instance pool, one physical node can have only one executor. Is this understanding right? If so, does that mean that when I create a Job Cluster for my workflow with a high-end instance type, I need to configure the executor with bigger values? For example, if I specify the node type as 122GB + 16 cores, then since one node runs one executor, a typical k8s config like

spark.executor.memory 16g
spark.executor.cores 4

will waste a lot of resources, right?

Thanks

-werners-
Esteemed Contributor III

Executor = worker in the Databricks workflow context.
So indeed, if you set spark.executor.cores = 4 on a 16-core node, 12 cores would go unused.
So adjusting spark.executor.cores and spark.task.cpus might be a good idea.
There is also the option to define this dynamically (at job level!) with spark.dynamicAllocation.enabled.
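
To put rough numbers on the example from the question, here is a minimal Python sketch. It assumes exactly one executor per node (the premise of this thread) and ignores memory overhead and whatever the platform reserves for the OS and its own services, so the figures are upper bounds rather than exact values. The function and class names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeUsage:
    task_slots: int        # concurrent tasks the executor can run
    idle_cores: int        # cores on the node that no executor claims
    idle_memory_gb: float  # memory on the node that no executor claims


def idle_resources(node_cores, node_memory_gb, executor_cores,
                   executor_memory_gb, task_cpus=1, executors_per_node=1):
    """Estimate unused node resources for a given executor sizing.

    Assumption: executors_per_node is fixed (1 in this thread's scenario).
    Memory overhead and OS reservations are deliberately ignored.
    """
    used_cores = min(node_cores, executor_cores * executors_per_node)
    used_memory = min(node_memory_gb, executor_memory_gb * executors_per_node)
    # Spark runs at most executor_cores // task_cpus tasks per executor.
    slots = (executor_cores // task_cpus) * executors_per_node
    return NodeUsage(slots, node_cores - used_cores,
                     node_memory_gb - used_memory)


# Node type from the question: 122 GB + 16 cores, with the k8s-style config.
usage = idle_resources(node_cores=16, node_memory_gb=122,
                       executor_cores=4, executor_memory_gb=16)
print(usage)
```

Under these assumptions the 16g / 4-core config leaves 12 cores and roughly 106 GB unclaimed on each node, which is why sizing the executor to the node matters here.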


MikeGo
Valued Contributor

Thanks. With spark.dynamicAllocation.enabled, does each executor still use one physical node?

-werners-
Esteemed Contributor III

It can change the number of executors dynamically, if I am not mistaken.
But maybe Databricks has set the 1-1 ratio in stone. Something to test, I'd say.
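
One possible way to run that test, sketched in Python: Spark's monitoring REST API exposes an executor list at /api/v1/applications/[app-id]/executors, where each entry carries fields such as id, hostPort and totalCores. Whether and how that endpoint is reachable on a given Databricks cluster is an assumption here, so the sketch only covers the counting step and uses a hypothetical sample payload in place of a real response.

```python
from collections import Counter


def executors_per_host(executors):
    """Count non-driver executors per host from an executor list."""
    hosts = Counter()
    for entry in executors:
        if entry["id"] == "driver":
            continue  # the driver is listed too, but it is not a worker executor
        host = entry["hostPort"].rsplit(":", 1)[0]  # strip the port
        hosts[host] += 1
    return dict(hosts)


# Hypothetical payload, shaped like the REST API's executor list.
sample = [
    {"id": "driver", "hostPort": "10.0.0.1:40000", "totalCores": 0},
    {"id": "0", "hostPort": "10.0.0.2:41000", "totalCores": 16},
    {"id": "1", "hostPort": "10.0.0.3:41000", "totalCores": 16},
]

counts = executors_per_host(sample)
print(counts)
# True if every worker host runs exactly one executor.
print(all(n == 1 for n in counts.values()))
```

Running the same count with and without spark.dynamicAllocation.enabled would show whether any host ever ends up with more than one executor.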