08-22-2023 04:46 PM - edited 08-22-2023 04:55 PM
Hi Databricks community,
I'm using a Databricks Jobs cluster to run some jobs. I'm setting the worker and driver type to AWS m6gd.large, which has 2 cores and 8G of memory each.
After seeing that executor memory defaults to 2G, I wanted to increase it by setting "spark.executor.memory 6g" in the Spark config during cluster setup. When I do, the UI rejects the value, indicating the maximum I can set is 2G (see the attachment). Given the worker has 8G of memory, why is it limited to only 2G? The situation is similar for larger worker types: the limit seems to be much lower than what should be available.
08-28-2023 12:07 PM - edited 08-28-2023 12:08 PM
I think I found the right answer here: https://kb.databricks.com/en_US/clusters/spark-shows-less-memory
It seems a fixed ~4GB is reserved for internal node services. So depending on the node type, `spark.executor.memory` is fixed by Databricks and can't be adjusted further.
All the parameters mentioned above still apply, but only to the leftover (~2GB) available for execution; the proportions within that leftover can be tuned.
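To make the arithmetic concrete, here's a rough sketch of where the memory goes on this node type. The KB article only says ~4GB goes to internal node services, so treating it as exactly 4096MB (and the rest of the split) is my assumption for illustration:

```python
# Back-of-the-envelope arithmetic for this thread's m6gd.large worker (8GB RAM).
# The ~4GB reserved for internal node services comes from the KB article above;
# treating it as exactly 4096MB is an assumption, not a documented value.

NODE_MEMORY_MB = 8192        # m6gd.large total RAM
NODE_SERVICES_MB = 4096      # ~4GB for Databricks internal node services (assumed)

available_mb = NODE_MEMORY_MB - NODE_SERVICES_MB  # what's left for the executor
heap_cap_mb = 2048                                # the ~2GB cap observed in the UI
offheap_and_os_mb = available_mb - heap_cap_mb    # overhead, off-heap, OS headroom

print(f"left after node services: {available_mb} MB")
print(f"observed heap cap:        {heap_cap_mb} MB")
print(f"remainder (overhead/OS):  {offheap_and_os_mb} MB")
```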
The thread helped me understand how the memory is being set. Good lesson.
08-23-2023 05:14 AM
Hi @938452, the limitation you're seeing with the Spark executor memory is likely due to the overhead memory reserved by the system for internal processes. In Spark, an executor's memory consists of the executor memory itself (`spark.executor.memory`) and the executor's overhead memory (`spark.executor.memoryOverhead`). The overhead is off-heap memory used for JVM overheads, interned strings, other native overheads, etc. By default, `spark.executor.memoryOverhead` is max(384MB, 0.10 * spark.executor.memory). If `spark.executor.memory` were set to 8G, the system would reserve around 800MB as overhead, leaving less than 8GB for the executor heap.
You can lower `spark.executor.memoryOverhead` to free up more executor memory.
However, be careful not to set it too low, as it might lead to OutOfMemoryErrors.
Also note that the specific memory allocation can vary depending on other factors, such as the particular configuration of the Databricks Runtime and the AWS instance type.
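The default overhead formula above can be sketched in plain Python. This mirrors Spark's documented default for `spark.executor.memoryOverhead`, not any Databricks-specific rule:

```python
def default_executor_overhead_mb(executor_memory_mb: float) -> float:
    """Spark's default spark.executor.memoryOverhead:
    max(384MB, 10% of spark.executor.memory)."""
    return max(384.0, 0.10 * executor_memory_mb)

# For an 8G heap the default overhead is the "~800MB" mentioned above:
print(default_executor_overhead_mb(8192))  # 819.2
# For small heaps the 384MB floor applies instead of the 10% factor:
print(default_executor_overhead_mb(2048))  # 384.0
```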
Sources:
- https://docs.databricks.com/workflows/jobs/settings.html
- https://docs.databricks.com/archive/compute/configure.html
- https://docs.databricks.com/workflows/jobs/jobs-2.0-api.html
08-25-2023 11:09 AM - edited 08-25-2023 12:26 PM
@Kaniz thanks for the reply.
I'm going to try it, but I don't think it fully addresses the issue. According to your explanation, given an 8GB worker, the default would reserve ~800MB for overhead memory. That still leaves ~7GB available, yet the platform limited spark.executor.memory to around 2GB (see the screenshot in the initial post). There could be other reservations, but that still leaves a decent portion of memory that should be available for allocation.
EDIT: I tried lowering spark.executor.memoryOverheadFactor. The limit is still the same.
08-28-2023 01:52 AM
Hi @938452,
`spark.executor.memory` is the memory available to each executor's JVM. It is a subset of the node's total memory and is divided into execution memory and storage memory. Since `spark.executor.memory` is limited to around 2GB here, the memory available for execution and storage is limited as well. The relevant parameters are `spark.executor.memory`, `spark.memory.fraction`, and `spark.memory.storageFraction`:
- Increasing `spark.executor.memory` increases total Spark memory but may reduce memory for user and unroll memory.
- Increasing `spark.memory.fraction` allocates more memory to Spark at the expense of user memory.
- Decreasing `spark.memory.storageFraction` provides more memory for execution by reducing the memory reserved for cached RDD storage.
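As a rough illustration of how these fractions interact, here is a sketch of Spark's unified memory model using the stock defaults (`spark.memory.fraction` = 0.6, `spark.memory.storageFraction` = 0.5, and the fixed 300MB reservation Spark's UnifiedMemoryManager applies). Actual numbers on Databricks may differ:

```python
RESERVED_MB = 300.0  # fixed reservation inside the executor JVM heap

def unified_memory_split(heap_mb: float,
                         memory_fraction: float = 0.6,    # spark.memory.fraction default
                         storage_fraction: float = 0.5):  # spark.memory.storageFraction default
    """Sketch of Spark's unified memory model for a given executor heap (MB)."""
    spark_mb = (heap_mb - RESERVED_MB) * memory_fraction  # execution + storage pool
    storage_mb = spark_mb * storage_fraction              # evictable storage share
    execution_mb = spark_mb - storage_mb                  # execution share
    user_mb = heap_mb - RESERVED_MB - spark_mb            # user data structures, UDFs
    return spark_mb, storage_mb, execution_mb, user_mb

# With the ~2GB heap cap from this thread:
spark_mb, storage_mb, execution_mb, user_mb = unified_memory_split(2048)
print(round(spark_mb, 1), round(storage_mb, 1),
      round(execution_mb, 1), round(user_mb, 1))
```

Raising `memory_fraction` grows `spark_mb` at the expense of `user_mb`; lowering `storage_fraction` shifts more of `spark_mb` toward execution.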