01-25-2025 12:26 AM
Overhead memory (used for things like off-heap storage and shuffle operations) is separate from spark.executor.memory and is allocated in addition to it.
Let's break this down clearly:
1. Memory Breakdown in Spark Executors
Each executor's total memory consists of the following components:
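a) Spark Executor Memory (spark.executor.memory):
The JVM heap size of each executor, used for task execution and cached data.
b) Spark Memory Overhead (spark.executor.memoryOverhead):
Additional non-heap memory allocated per executor, on top of the heap.
Default value = max(384MB, 0.1 * spark.executor.memory).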
Configurable using spark.executor.memoryOverhead.
c) Total Executor Memory:
The total memory allocated per executor is the sum of the two:
Total Executor Memory = spark.executor.memory + spark.executor.memoryOverhead
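As a rough illustration of the two formulas above, here is a minimal Python sketch. The function names and constants are hypothetical helpers for this post, not part of any Spark API, and the truncation to whole megabytes is an assumption made only to keep the numbers simple.

```python
from typing import Optional

MIN_OVERHEAD_MB = 384    # floor from the default formula above
OVERHEAD_FACTOR = 0.10   # fraction of spark.executor.memory


def default_overhead_mb(executor_memory_mb: int) -> int:
    """Default overhead: max(384MB, 0.1 * spark.executor.memory)."""
    # Truncating to whole MB is an assumption of this sketch.
    return max(MIN_OVERHEAD_MB, int(OVERHEAD_FACTOR * executor_memory_mb))


def total_executor_memory_mb(executor_memory_mb: int,
                             overhead_mb: Optional[int] = None) -> int:
    """Total = spark.executor.memory + spark.executor.memoryOverhead."""
    if overhead_mb is None:
        # No explicit spark.executor.memoryOverhead: fall back to the default.
        overhead_mb = default_overhead_mb(executor_memory_mb)
    return executor_memory_mb + overhead_mb


# A small 2GB executor hits the 384MB floor, since 10% would be only ~204MB.
small_total = total_executor_memory_mb(2048)
# An 8GB executor with an explicitly configured 1GB overhead.
explicit_total = total_executor_memory_mb(8192, overhead_mb=1024)
print(small_total, explicit_total)
```

Note that for small executors the 384MB floor dominates, so the overhead is proportionally larger than 10%.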
2. Cluster Memory Configuration in Your Case
Given the setup you described:
- Machine Memory: 16GB per worker node.
- Spark Executor Memory: 7.6GB per executor (spark.executor.memory).
- Remaining Memory: 16GB - 7.6GB = 8.4GB on the machine after accounting for spark.executor.memory. This has to cover the executor overhead as well as the OS and other processes.
Let’s calculate the breakdown:
- Executor JVM Memory: 7.6GB is reserved for spark.executor.memory.
- Overhead Memory:
- By default, spark.executor.memoryOverhead is max(384MB, 0.1 * spark.executor.memory), i.e., max(384MB, 0.76GB) = 0.76GB in your case.
- Assuming one executor per node, this leaves 16GB - (7.6GB + 0.76GB) = ~7.64GB for the OS, YARN, or other processes.
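The arithmetic above can be double-checked with a short Python sketch. The function and parameter names here are made up for illustration, it assumes one executor per node unless told otherwise, and it treats 384MB as 0.375GB (1GB = 1024MB).

```python
def node_memory_budget_gb(node_gb: float,
                          executor_gb: float,
                          executors_per_node: int = 1,
                          overhead_factor: float = 0.10,
                          min_overhead_gb: float = 384 / 1024) -> dict:
    """Split a worker node's memory into executor heap, overhead, and the rest."""
    # Default overhead: max(384MB, 0.1 * spark.executor.memory), expressed in GB.
    overhead_gb = max(min_overhead_gb, overhead_factor * executor_gb)
    per_executor_gb = executor_gb + overhead_gb
    # Whatever the executors do not claim is left for the OS and other processes.
    remaining_gb = node_gb - executors_per_node * per_executor_gb
    return {
        "overhead_gb": overhead_gb,
        "per_executor_gb": per_executor_gb,
        "remaining_gb": remaining_gb,
    }


budget = node_memory_budget_gb(node_gb=16, executor_gb=7.6)
print(f"overhead={budget['overhead_gb']:.2f} GB, "
      f"total={budget['per_executor_gb']:.2f} GB, "
      f"remaining={budget['remaining_gb']:.2f} GB")
# prints: overhead=0.76 GB, total=8.36 GB, remaining=7.64 GB
```

If more than one executor runs on the same node, pass a larger executors_per_node to see how quickly the remaining memory shrinks.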
Please mark this as the solution if it helps.
Regards,
Avinash N