Spark Executor - Parallelism Question
Tuesday
While reading the book Spark: The Definitive Guide, I came across the statement below in Chapter 2, on partitions.
"If you have many partitions but only one executor, Spark will still have a parallelism of only one because there is only one computation resource."
I am a little puzzled: does the above statement assume that the executor has only one core?
Wednesday
Hey @SANJAYKJ
The statement is correct in the sense that a single executor is a limiting factor, but the actual parallelism within that executor depends on the number of cores assigned to it. If the single executor has multiple cores, it can process multiple partitions at the same time, up to the number of available cores. To leverage many partitions effectively, you need either more executors or executors with more cores.
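You can see this in local mode, where the driver JVM acts as the single "executor". A minimal sketch (assuming a plain PySpark install; the master setting and partition count are just illustrative values):

```python
from pyspark.sql import SparkSession

# One "executor" (the local JVM) with 4 cores.
# With local[1] the same job would run its tasks one at a time,
# even though the data has 8 partitions.
spark = (
    SparkSession.builder
    .master("local[4]")          # 4 cores -> up to 4 tasks run concurrently
    .appName("parallelism-demo")
    .getOrCreate()
)

rdd = spark.sparkContext.parallelize(range(1_000_000), numSlices=8)
print(rdd.getNumPartitions())    # 8 partitions, but at most 4 processed at once
print(rdd.map(lambda x: x * 2).count())

spark.stop()
```

On a real cluster, the corresponding knobs are the executor count and spark.executor.cores (or --executor-cores with spark-submit): total task parallelism is roughly executors × cores per executor, provided there are at least that many partitions.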
🙂
Isi

