topic Re: Spark Memory Configuration– Request for Clarification in Data Engineering

Spark Memory Configuration– Request for Clarification

sowanth — Fri, 08 Aug 2025 16:31:36 GMT

Hi Team,
I have noticed the following Spark configuration is being applied, though it's not defined in our repo or anywhere in the policies:

spark.memory.offHeap.enabled = true
spark.memory.offHeap.size = around 3/4 of the node instance memory (i.e., 1-3x the executor memory)

This setup leaves only around 1/4 of the node's memory for executor allocation. We can override this setting in our own Spark configuration, but we are not sure where it is being set.
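To make the split described above concrete, here is a rough arithmetic sketch. The 64 GB node size and the helper name `split_node_memory` are hypothetical, chosen only for illustration, and the calculation ignores OS and container overhead, so the real executor heap would be smaller than the "remaining" figure.

```python
def split_node_memory(node_memory_gb: float, off_heap_fraction: float = 0.75) -> dict:
    """Split a node's memory into an off-heap share and what is left over.

    Hypothetical helper: off_heap_fraction=0.75 mirrors the "around 3/4"
    figure observed in the post, not a documented Databricks default.
    """
    if not 0 <= off_heap_fraction <= 1:
        raise ValueError("off_heap_fraction must be between 0 and 1")
    off_heap_gb = node_memory_gb * off_heap_fraction
    remaining_gb = node_memory_gb - off_heap_gb
    # Ratio of off-heap to the memory left for everything else on the node.
    ratio = off_heap_gb / remaining_gb if remaining_gb else float("inf")
    return {
        "off_heap_gb": off_heap_gb,
        "remaining_gb": remaining_gb,
        "off_heap_to_remaining": ratio,
    }


# Assumed 64 GB node: 48 GB off-heap, 16 GB left, a 3x ratio.
result = split_node_memory(64)
```

With these assumed numbers the off-heap region is three times the remaining memory, which matches the upper end of the 1-3x range mentioned above.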

Such large off-heap allocation is rarely needed for our case.

1. Do you have any specific recommendations for using this much off-heap memory?
2. May I know where the off-heap memory config is set in the Databricks cluster? Additionally, could you explain the rationale behind allocating more off-heap memory than executor memory in this strategy?

Databricks Runtime version: 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12) and 13.3 LTS

Thanks & Regards,
Sowanth

Re: Spark Memory Configuration– Request for Clarification

Advika — Mon, 11 Aug 2025 15:25:29 GMT

Hello @sowanth!

Off-heap memory is automatically configured on some clusters to improve stability and reduce Java garbage collection issues, particularly for Photon or heavy caching workloads. This setting isn’t coming from your repo or policies but is applied at the cluster level. If your Spark jobs don’t require this much off-heap memory, you can adjust it by overriding spark.memory.offHeap.enabled and spark.memory.offHeap.size in the cluster’s Spark configuration.

https://kb.databricks.com/en_US/clusters/spark-executor-memory
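As a minimal sketch of the override described above: the first helper filters the memory-related keys out of a list of `(key, value)` pairs, such as the one returned by `spark.sparkContext.getConf().getAll()` in a notebook, and the second renders override lines for the cluster's Spark configuration box. The helper names, the `"<not set>"` placeholder, and the `4g` size are illustrative assumptions, not recommended values, and the one-pair-per-line "key value" layout is assumed from how cluster Spark config is typically entered.

```python
from typing import Dict, Iterable, Tuple

# The two keys from this thread, plus executor memory for comparison.
MEMORY_KEYS = (
    "spark.memory.offHeap.enabled",
    "spark.memory.offHeap.size",
    "spark.executor.memory",
)


def memory_settings(conf_pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Return the memory-related settings found in (key, value) pairs."""
    conf = dict(conf_pairs)
    return {key: conf.get(key, "<not set>") for key in MEMORY_KEYS}


def render_override(enabled: bool, size: str = "") -> str:
    """Render override lines, one 'key value' pair per line (assumed layout)."""
    lines = [f"spark.memory.offHeap.enabled {str(enabled).lower()}"]
    if enabled:
        if not size:
            raise ValueError("a size is required when off-heap is enabled")
        lines.append(f"spark.memory.offHeap.size {size}")
    return "\n".join(lines)


# In a notebook attached to the cluster (assumes the usual `spark` session):
#   memory_settings(spark.sparkContext.getConf().getAll())
override_text = render_override(True, "4g")  # "4g" is a placeholder value
```

Checking the effective values before and after a cluster restart is one way to confirm that the override took effect; it is worth trying on a non-production cluster first.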

Re: Spark Memory Configuration– Request for Clarification

sowanth — Wed, 13 Aug 2025 11:23:17 GMT

Hi @Advika,
Thanks for the details, much appreciated.
Yes, I had already referred to this document, but I can't find anything on how much benefit this higher default off-heap allocation provides on these node types, or any benchmark details for caching or other workloads.

Regards,

Sowanth

Re: Spark Memory Configuration– Request for Clarification

sowanth — Wed, 13 Aug 2025 13:26:22 GMT

Now I understand how it's automatically configured in our cluster along with the rationale behind this off-heap memory approach.

However, I have some concerns about this configuration:

General applicability: Most jobs don't actually require a 70% off-heap memory allocation.
Industry recommendations: Leading LLMs (Claude, GPT, DeepSeek) don't recommend such high off-heap memory usage. They suggest a much smaller percentage, sized relative to the executor memory.
Lack of benchmarks: I haven't found any test results or benchmarks supporting this configuration for caching or other workloads, or even for GC optimization.
Cost implications: While this might help in some edge cases, it doesn't seem beneficial for general use cases and could be significantly increasing our costs.

Could you please share any benchmark data or test results you have for this specific job configuration? This would help us better understand the performance benefits versus the cost impact.

Best regards,
Sowanth