overhead memory (used for things like off-heap storage and shuffle operations) is separate from spark.executor.memory.

Let's break this down clearly:

1. Memory Breakdown in Spark Executors

Each executor's total memory consists of the following components:

a) Spark Executor Memory (spark.executor.memory):

The JVM heap size of each executor process.

b) Spark Memory Overhead (spark.executor.memoryOverhead):

Default value = max(384MB, 0.1 * spark.executor.memory).

Configurable using spark.executor.memoryOverhead.

c) Total Executor Memory:

The total memory allocated per executor is the sum of the two:

Total Executor Memory = spark.executor.memory + spark.executor.memoryOverhead
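As a rough illustration of the two formulas above, here is a minimal Python sketch. The function names are made up for this example and are not part of Spark, and it does not try to reproduce any rounding Spark applies internally, so treat the results as approximate.

```python
# Sketch of the default overhead and total executor memory formulas.
# All names here are hypothetical helpers, not Spark APIs.

MIN_OVERHEAD_MB = 384    # floor in the default overhead formula
OVERHEAD_FACTOR = 0.1    # fraction of spark.executor.memory


def default_overhead_mb(executor_memory_mb):
    """Default overhead: max(384MB, 0.1 * spark.executor.memory)."""
    return max(MIN_OVERHEAD_MB, OVERHEAD_FACTOR * executor_memory_mb)


def total_executor_memory_mb(executor_memory_mb, overhead_mb=None):
    """Heap plus overhead; uses the default overhead unless one is given."""
    if overhead_mb is None:
        overhead_mb = default_overhead_mb(executor_memory_mb)
    return executor_memory_mb + overhead_mb


# Small heap: 10% of 2048MB is below the floor, so 384MB is used.
print(total_executor_memory_mb(2048))
# Larger heap: the 10% term is bigger than the floor.
print(total_executor_memory_mb(8192))
# Explicit overhead, as if spark.executor.memoryOverhead were set to 1024MB.
print(total_executor_memory_mb(8192, overhead_mb=1024))
```

The point to notice is that for small heaps the 384MB floor dominates, while for larger heaps the overhead grows with spark.executor.memory.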

2. Cluster Memory Configuration in Your Case

Given the setup you described:

  • Machine Memory: 16GB per worker node.
  • Spark Executor Memory: 7.6GB per executor (spark.executor.memory).
  • Available for Overhead: Whatever remains on the machine after spark.executor.memory, i.e., 16GB - 7.6GB = 8.4GB, which also has to cover the OS and other processes.

Let’s calculate the breakdown:

  1. Executor JVM Memory: 7.6GB is reserved for spark.executor.memory.
  2. Overhead Memory:
    • By default, spark.executor.memoryOverhead is max(384MB, 0.1 * spark.executor.memory), i.e., max(384MB, 0.76GB) = 0.76GB in your case.
    • This leaves 16GB - (7.6GB + 0.76GB) = 7.64GB for the OS, YARN, or other processes, assuming one executor per node (a second executor of this size would not fit in 16GB).
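If you want to redo this arithmetic for other node or executor sizes, something like the following Python sketch may help. The helper name and the executors_per_node parameter are assumptions made for this example (not Spark settings), and it only applies the default overhead formula quoted above.

```python
# Sketch of the per-node memory breakdown for the numbers in this thread.
# node_memory_breakdown and executors_per_node are hypothetical names.

NODE_MEMORY_GB = 16.0
EXECUTOR_MEMORY_GB = 7.6          # spark.executor.memory
MIN_OVERHEAD_GB = 384 / 1024      # the 384MB floor expressed in GB


def node_memory_breakdown(node_gb, executor_gb, executors_per_node=1):
    """Return (overhead, per-executor total, memory left on the node) in GB."""
    overhead_gb = max(MIN_OVERHEAD_GB, 0.1 * executor_gb)
    per_executor_gb = executor_gb + overhead_gb
    leftover_gb = node_gb - executors_per_node * per_executor_gb
    return overhead_gb, per_executor_gb, leftover_gb


overhead, per_executor, leftover = node_memory_breakdown(
    NODE_MEMORY_GB, EXECUTOR_MEMORY_GB
)
print(f"overhead:     {overhead:.2f} GB")
print(f"per executor: {per_executor:.2f} GB")
print(f"left on node: {leftover:.2f} GB")
```

With the values from this thread it reports 0.76 GB of overhead, 8.36 GB per executor, and 7.64 GB left on the node. Passing executors_per_node=2 gives a negative leftover, which is why only one executor of this size fits on a 16GB worker.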

Mark it as solution if this helps.

Regards,

Avinash N
