Jobs on an all-purpose Databricks cluster are failing with the error: "The Spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached."
The event log shows: event_type = DRIVER_NOT_RESPONDING, message = "Driver is up but is not responsive, likely due to GC."
One common cause of this error is a memory bottleneck on the driver. When that happens, the driver either crashes with an out-of-memory (OOM) error and restarts, or becomes unresponsive because it spends its time in frequent full garbage collection. In practice, roughly nine times out of ten this GC symptom traces back to memory exhaustion, so the first thing to try is increasing the driver's memory and seeing if that helps.
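If you manage the cluster's Spark configuration directly, the driver-side memory settings below are the usual knobs. This is only a sketch: the values shown are illustrative assumptions, and on Databricks the effective driver heap is normally governed by the driver VM size rather than by hand-set memory flags, so treat `spark.driver.maxResultSize` as the more commonly useful setting here.

```
# Illustrative cluster Spark config (tune values to your driver VM size)
spark.driver.maxResultSize 8g   # cap on results collected back to the driver;
                                # oversized collect()/toPandas() results are a
                                # frequent driver-OOM trigger
spark.driver.memory 24g         # driver heap; on Databricks usually derived
                                # from the driver node type instead
```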
I would try a driver with a larger amount of memory, just to check whether it can handle the load. For example, run the process on a Standard_E20d_v4 or Standard_E32d_v4 (the latter has twice the RAM, so it should cope).
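You can scale up only the driver while keeping smaller workers by setting a separate driver node type in the cluster definition. A minimal sketch in the Databricks Clusters API JSON format (field values are examples, not a recommendation):

```
{
  "cluster_name": "etl-cluster",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "Standard_E8d_v4",
  "driver_node_type_id": "Standard_E32d_v4",
  "num_workers": 4
}
```

Setting `driver_node_type_id` independently of `node_type_id` is cheaper than scaling the whole cluster when the bottleneck is on the driver alone.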