Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Question about monitoring driver memory utilization

wojciech_jakubo
New Contributor III

Hi Databricks/Spark experts!

I have a piece of pandas-based 3rd-party code that I need to execute as part of a bigger Spark pipeline. By nature, pandas-based code runs on the driver node. I ran into out-of-memory problems and started exploring how to monitor driver node memory utilization.

My questions are:

1) I have an idle cluster with 56 GB of RAM, and when looking at the new "Metrics" tab I see weird memory fluctuations/cycles. Where do these cycles come from? The cluster is not running any code (CPU utilization ~0%), so I am wondering what's going on?

[Image: driver memory cycles]

2) My understanding is that the orange "used" series shows memory used by my Python code. But what exactly is the greenish-blue area below it, labeled "other"? Even when I run my code, the vast majority of memory on the 56 GB driver node is occupied by this "other" stuff:

[Image: busy cluster]

I don't believe that OS/Docker/Spark/JVM overhead takes 35-40 GB of RAM. So what exactly is it, and how can I reduce it to make more "room" for my code?
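One way to cross-check what the chart calls "other" is to look at the OS-level view from the driver itself. A minimal sketch, assuming psutil is available on the driver (if it isn't, it can be installed with %pip install psutil):

# Minimal sketch: OS-level memory breakdown as seen from the driver node.
import psutil

vm = psutil.virtual_memory()
for field in ("total", "used", "available", "buffers", "cached", "shared"):
    value = getattr(vm, field, None)  # some fields are Linux-only
    if value is not None:
        print(f"{field:>10}: {value / 1024**3:6.1f} GiB")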

3) How does the spark.driver.memory setting affect all of that? According to the Spark docs, it defaults to 1g. Is this the maximum amount of memory I can use when running my code (the "used" series)? 1g seems extremely low. Would it make sense to increase it to 8 or 16g for my scenario?

Thx!

7 REPLIES

Anonymous
Not applicable

Hi @Wojciech Jakubowski

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question. Thanks.

tapash-db
New Contributor II

You can always reconfigure that and set it to a higher value.

spark.driver.maxResultSize 4g  -- raises the cap on the total size of results collected back to the driver (the driver heap itself is sized with spark.driver.memory)
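To confirm which values are actually in effect on a running cluster, one option is to read them back from the SparkContext in a notebook. A minimal sketch, assuming a Databricks notebook where spark is predefined; note that spark.driver.memory only takes effect if it is set before the driver JVM starts, i.e. in the cluster's Spark config, not from a running notebook:

# Minimal sketch: read back the driver-related settings currently in effect.
# Assumes a Databricks notebook where `spark` (a SparkSession) is predefined.
conf = spark.sparkContext.getConf()

# Fall back to Spark's documented defaults if a key was never set explicitly.
print("spark.driver.memory        =", conf.get("spark.driver.memory", "1g"))
print("spark.driver.maxResultSize =", conf.get("spark.driver.maxResultSize", "1g"))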

Tharun-Kumar
Honored Contributor II

@wojciech_jakubo 

Regarding the first question: driver memory utilization is high, and we can see multiple cycles of high utilization. The primary reason is that, even when a cluster is idle, the driver has to perform several operations to keep the cluster active and ready for processing. Some of these activities are:

  • heartbeat messages
  • garbage collection
  • listening for job requests
  • hosting the Spark UI
  • monitoring resources

These happen at regular intervals, which is why memory utilization on the driver moves in cycles.

Hi Tharun-Kumar.

Thanks for your answer. I get that all the activities you listed are required for the cluster to function correctly. But 40 GB of RAM for that? That looks way too much IMO... especially since all these activities also run on much smaller drivers that have only 8 or 16 GB of RAM...

Tharun-Kumar
Honored Contributor II

@wojciech_jakubo 

Regarding your third question: you can find the actual value of spark.driver.memory by looking at the Executors tab in the Spark UI. The driver is listed there as well, so you can see the memory actually allocated to it.

In the Spark UI, we are only able to see the storage memory. Execution memory will be almost equal to the storage memory.

[Screenshot: Spark UI Executors tab, 2023-08-15]

In this case, the driver memory would be 21.4 GB. This is the amount of memory allocated to JVM-related activities.
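For reference, the driver JVM's actual heap limit can also be read programmatically from a notebook. A minimal sketch (sc._jvm is an internal py4j handle, so treat this as a convenience check rather than a stable API):

# Minimal sketch: ask the driver JVM directly for its maximum heap size.
# Assumes a Databricks notebook where `spark` is predefined; `_jvm` is an
# internal py4j gateway used here only for a quick sanity check.
sc = spark.sparkContext
max_heap_bytes = sc._jvm.java.lang.Runtime.getRuntime().maxMemory()
print(f"Driver JVM max heap: {max_heap_bytes / 1024**3:.1f} GiB")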

Hi,

How did you infer from this image that the driver memory would be 21.4 GB? Shouldn't it be 10.4 GB?

Also, if this memory is allocated to JVM-related activities, can it also be used from Python? Meaning, if I have 21.4 GB for the JVM, does that mean I can use all of it from Python (for instance, to load some crazy pandas dataframes)?

Tharun-Kumar
Honored Contributor II

Hi @wojciech_jakubo 

1. JVM memory will not be utilized for Python-related activities.

2. In the image we can only see the storage memory. There is also execution memory, which would be about the same size. That is how I arrived at 21.4 GB for the driver memory.
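For what it's worth, Spark's documented unified-memory formula ties the storage + execution pool to the JVM heap: unified ≈ (heap - 300 MB) * spark.memory.fraction, with 0.6 as the default fraction. A rough back-of-the-envelope check in Python (the heap size below is purely an illustrative assumption, not a value taken from the screenshot):

# Rough sketch of Spark's unified-memory sizing (see the Spark tuning docs):
# usable unified memory = (heap size - 300 MB reserved) * spark.memory.fraction
RESERVED_MB = 300
MEMORY_FRACTION = 0.6          # Spark's default spark.memory.fraction

heap_gb = 36                   # illustrative assumption only
unified_gb = (heap_gb * 1024 - RESERVED_MB) * MEMORY_FRACTION / 1024
print(f"~{unified_gb:.1f} GB of unified (storage + execution) memory")  # ~21.4 GB

Under the default fraction, a unified pool of about 21.4 GB would correspond to a heap of roughly 36 GB.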
