
Spark Executor - Parallelism Question

SANJAYKJ
New Contributor II

While reading the book Spark: The Definitive Guide, I came across the statement below in Chapter 2, on partitions.

"If you have many partitions but only one executor, Spark will still have a parallelism of only one because there is only one computation resource."

I am a little puzzled: does this statement assume that the executor has only one core?

1 REPLY

Isi
Contributor

Hey @SANJAYKJ 

The statement is correct in the sense that a single executor is a limiting factor, but the actual parallelism within that executor depends on the number of cores assigned to it. To process many partitions in parallel, you need either more executors or executors with more cores.

If the single executor has multiple cores, then it can process multiple partitions at the same time, up to the number of available cores.
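
To make the arithmetic concrete, here is a minimal sketch in plain Python (no Spark required). The helper names `task_slots` and `waves` are hypothetical and exist only for this illustration. The parameters are meant to mirror the Spark settings `spark.executor.instances`, `spark.executor.cores`, and `spark.task.cpus` (which defaults to 1). It assumes every executor has the same number of cores and ignores scheduling details such as data locality and stragglers.

```python
import math


def task_slots(num_executors: int, cores_per_executor: int, cpus_per_task: int = 1) -> int:
    """Upper bound on the number of tasks (partitions) that can run at once.

    Hypothetical helper: each executor can run
    cores_per_executor // cpus_per_task tasks concurrently.
    """
    return num_executors * (cores_per_executor // cpus_per_task)


def waves(num_partitions: int, slots: int) -> int:
    """Rough number of scheduling rounds needed to process all partitions."""
    return math.ceil(num_partitions / slots)


# One executor with one core: parallelism of one, as in the book's statement.
print(task_slots(num_executors=1, cores_per_executor=1))   # 1

# One executor with four cores: up to four partitions at the same time.
print(task_slots(num_executors=1, cores_per_executor=4))   # 4

# Eight partitions on that single four-core executor take about two rounds.
print(waves(num_partitions=8, slots=task_slots(1, 4)))     # 2
```

Under these assumptions, the book's statement corresponds to the first case: one executor with a single core gives exactly one task slot, no matter how many partitions there are.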

🙂

Isi
