02-01-2023 02:59 AM
Failure starting repl. Try detaching and re-attaching the notebook.
java.lang.Exception: Python repl did not start in 30 seconds.
at com.databricks.backend.daemon.driver.IpykernelUtils$.startIpyKernel(JupyterDriverLocal.scala:1442)
at com.databricks.backend.daemon.driver.JupyterDriverLocal.startPython(JupyterDriverLocal.scala:1083)
at com.databricks.backend.daemon.driver.JupyterDriverLocal.<init>(JupyterDriverLocal.scala:624)
at com.databricks.backend.daemon.driver.PythonDriverWrapper.instantiateDriver(DriverWrapper.scala:723)
at com.databricks.backend.daemon.driver.DriverWrapper.setupRepl(DriverWrapper.scala:342)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:231)
at java.lang.Thread.run(Thread.java:750)
02-01-2023 03:26 AM
02-01-2023 12:02 PM
Hi @Mahesh Chahare, check whether the cluster is overloaded. This error can happen when too many processes try to start REPLs at the same time.
02-02-2023 11:37 PM
@Vivian Wilfred Previously, 20 jobs were running on one worker node. I have now reduced the number of jobs to 9 and increased the number of workers to 5, and I no longer get the REPL error. However, I am getting two other errors: "TimeoutException: Futures timed out after [5 seconds]" and "Fatal error: The Python kernel is unresponsive". I was getting these errors in my previous run too. The REPL issue is resolved.
02-02-2023 05:21 AM
Hi @Mahesh Chahare, this issue usually happens when your job runs many parallel tasks and each task tries to open a Python REPL. If that is the case for you, please try reducing the number of parallel tasks or increasing the driver's memory.
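For anyone wondering what "reducing the number of parallel tasks" could look like in practice, here is a minimal sketch. It assumes the parallel work is launched from driver-side Python threads, which the thread does not confirm. `run_task` and `MAX_PARALLEL` are hypothetical names used only for illustration; in a Databricks notebook the body of `run_task` might wrap a call such as `dbutils.notebook.run`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Assumed cap on concurrent tasks; tune it to what the driver can handle.
MAX_PARALLEL = 3


def run_task(name):
    # Hypothetical placeholder for the real work. In a Databricks notebook
    # this might wrap dbutils.notebook.run(...); that is an assumption.
    return f"{name}: done"


def run_all(task_names, max_parallel=MAX_PARALLEL):
    """Run every task, but never more than max_parallel at the same time."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Map each future back to its task name so results can be labelled.
        futures = {pool.submit(run_task, name): name for name in task_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Record the failure instead of aborting the remaining tasks.
                results[name] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    tasks = [f"job_{i}" for i in range(8)]
    for name, outcome in sorted(run_all(tasks).items()):
        print(name, "->", outcome)
```

With a cap like this, all 8 tasks still run, but at most 3 are active at any moment, so fewer Python REPLs should be starting at once.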
02-02-2023 11:38 PM
@Lakshay Goel Previously, 20 jobs were running on one worker node. I have now reduced the number of jobs to 9 and increased the number of workers to 5, and I no longer get the REPL error. However, I am getting two other errors: "TimeoutException: Futures timed out after [5 seconds]" and "Fatal error: The Python kernel is unresponsive". I was getting these errors in my previous run too. The REPL issue is resolved.
02-03-2023 04:02 AM
Hi @Mahesh Chahare, are you using Azure Event Hubs? Also, could you tell us which DBR version you are working on?
02-07-2023 02:37 AM
Hi @Lakshay Goel, the first job uses Event Hubs, and the second job creates 8 parallel jobs inside it.
DBR version: 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12)
02-07-2023 03:50 AM
Hi @Mahesh Chahare, the two issues are unrelated.
04-08-2023 08:11 PM
Hi @Mahesh Chahare,
Thank you for posting your question in our community! We are happy to assist you.
To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!
yesterday
I have had this problem many times. Today I made a copy of the cluster and it got "de-saturated". This might help someone in the future.