Driver is up but is not responsive, likely due to GC.
03-13-2020 01:53 AM
Hi all,
"Driver is up but is not responsive, likely due to GC."
This is the message in the cluster event logs. Can anyone help me with this? What does GC mean? Garbage collection? Can we control it externally?
Labels: Gc
03-17-2020 12:59 PM
Hi @vamsivarun007,
Please go through the KB article below to resolve this issue: https://kb.databricks.com/jobs/driver-unavailable.html
For what GC is, please check this answer: https://forums.databricks.com/questions/14725/how-to-resolve-spark-full-gc-on-cluster-startup.html
11-25-2020 11:57 PM
spark.catalog.clearCache() solved the problem for me 😉
06-17-2024 06:56 PM
Hi, I ran into the same problem when training a deep learning model. Could you tell me where to set this 'spark.catalog.clearCache()'? Thanks!
06-18-2024 12:12 AM
Nine times out of ten, GC problems like this are caused by running out of memory.
@Jaron spark.catalog.clearCache() is not a configuration option; it is a command that you run from your code.
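To illustrate the difference between a setting and a command, here is a minimal sketch of calling it from a Python script rather than setting it anywhere. The names clear_cached_data and main are hypothetical, and the sketch assumes the script can get the active session through SparkSession.builder.getOrCreate(); whether clearing the cache actually removes the GC pressure depends on the workload.

```python
def clear_cached_data(spark):
    """Drop all cached tables and DataFrames held by this Spark session."""
    # Catalog.clearCache() takes no arguments and returns None.
    spark.catalog.clearCache()


def main():
    # Assumption: pyspark is importable, as it is on a Spark cluster.
    # Imported here so the helper above can be used without pyspark loaded.
    from pyspark.sql import SparkSession

    # Reuses the session that already exists, or creates one if none does.
    spark = SparkSession.builder.getOrCreate()

    # ... the training / processing steps of the script would go here ...

    # Release cached data once it is no longer needed.
    clear_cached_data(spark)
```

Calling main() at the end of the script runs it; the call to clear_cached_data can also be placed between stages that no longer need the earlier cached data.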
06-18-2024 12:22 AM
Thanks! But I'm running the Python script file via workflow→jobs, so I can't submit "spark.catalog.clearCache()" from a notebook because the two are isolated. Is there any way out of this situation? 😭
One more question: is the "memory" you mentioned spark.executor.memory? My program runs on a machine with 64GB of memory, which should be large enough, but the GC issue still occurs. The docs I checked all say that "spark.executor.memory" may be too small, but I don't know how to check it or deal with it. (so tired 😫)
Looking forward to your reply, thanks!
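On the "how to check" part, a minimal sketch of reading the memory settings from inside the script. The helper name read_memory_settings and the "<not set>" placeholder are hypothetical. spark.driver.memory is included as an assumption, because the event log message refers to the driver; some environments may not expose every key this way.

```python
MEMORY_KEYS = ("spark.driver.memory", "spark.executor.memory")


def read_memory_settings(spark, keys=MEMORY_KEYS):
    """Return {key: value} for each setting, or "<not set>" when it is absent."""
    # spark.conf.get(key, default) returns the default when the key is unset.
    return {key: spark.conf.get(key, "<not set>") for key in keys}
```

With an active session, printing read_memory_settings(spark) near the start of the script puts the values in the job output, which shows whether the settings match what the 64GB machine is expected to provide.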