Tuesday
Hello Community,
I am facing an intermittent issue while running a Databricks job. The job fails with the following error message:
Run failed with error message:
Could not reach driver of cluster <cluster-id>.
Here are some additional details:
Job Setup: This job runs a standard ETL notebook
Behavior:
Questions for the community:
Any guidance or troubleshooting tips would be highly appreciated.
Note: I attached the cluster log for reference
Tuesday - last edited Tuesday
Hello @sandeepsuresh16 ,
Below are the answers to your questions:
The error "Could not reach driver of cluster <cluster-id>" can have several different causes. Use the following troubleshooting steps to check whether any of them matches your case:
Move from an F-series (compute-optimized) driver to a memory-optimized one (e.g., E/D-series), or at least to a larger F node. Increase driver memory by choosing a bigger node type, not just by setting spark.driver.memory in the Spark conf. Reduce collect()/toPandas() calls and any driver-side loops or UDF work.
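To find candidates for the collect()/toPandas() point, one option is to scan the notebook source for calls that pull data onto the driver. Below is a minimal sketch using only the Python standard library. The function name find_driver_heavy_calls and the list of call names are hypothetical and not part of any Databricks API. It is a plain text scan, so it will miss aliased calls and will also flag small, legitimate uses (for example after a limit()).

```python
import re

# Assumption: these are the PySpark calls we want to flag because they
# materialize data on the driver. Extend the tuple as needed.
DRIVER_HEAVY_CALLS = ("collect", "toPandas")

# Matches ".collect(" or ".toPandas(" but not ".collect_list(".
_PATTERN = re.compile(r"\.(" + "|".join(DRIVER_HEAVY_CALLS) + r")\s*\(")


def find_driver_heavy_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, call_name) for each flagged call, skipping comment lines."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # ignore lines that are entirely comments
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical notebook source, exported as plain Python text.
    sample = "\n".join([
        "df = spark.read.table('sales')",
        "rows = df.collect()",
        "# pdf = df.toPandas()",
        "pdf = df.limit(100).toPandas()",
    ])
    for lineno, call in find_driver_heavy_calls(sample):
        print(f"line {lineno}: .{call}()")
```

Each flagged line is a place to check whether the result really needs to live on the driver, or whether it can be aggregated, limited, or written out to a table instead.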
If you launch many notebooks/tasks at once, raise the REPL launch timeout (Jobs → Compute → Spark config):
Please let me know if you have any further questions.
Thanks
yesterday
Hello Anudeep,
Thank you for your detailed response and the helpful recommendations.
I would like to provide some additional context:
Regarding your suggestion about changing to a memory-optimized driver series, thank you for the recommendation; we will definitely consider this option.
Please let me know if there are any additional logs or metrics you would recommend checking in this specific scenario.
Thanks & Regards,
Sandeep
yesterday - last edited yesterday
Hi @sandeepsuresh16 ,
Check the two articles below. One of them suggests metrics to check, and both include suggestions on how to limit the occurrence of this problem.
Workflows are failing with a 'Could not reach driver of the cluster' error - Databricks
Job run fails with error message "Could not reach driver of cluster" - Databricks
yesterday
You can follow the recommendations and also check the KB articles shared by @szymon_dybczak. I think those should help you.