
Python notebook crashes with "The Python kernel is unresponsive"

TalY
New Contributor II

I have a Python notebook that works on my machine, but on Databricks it crashes at the same point every time with the errors "The Python kernel is unresponsive" and "The Python process exited with exit code 134 (SIGABRT: Aborted)." There is no stack trace for debugging the issue in the notebook output or in the Databricks cluster's logs, and no memory spikes in the monitoring. What can I do to debug this?

5 REPLIES

TalY
New Contributor II

I have been using the Ganglia UI, but I didn't see memory running out. Is that the correct way to monitor memory usage? Are there other options?

sean_owen
Databricks Employee

This is almost surely OOM. Yes, the Metrics tab in the cluster UI is the way to see memory usage. However, you may not observe high memory usage before the OOM; something may be allocating a huge amount of memory all at once.

I think 90% of these issues are resolvable by code inspection. What step fails? Is it pulling a bunch of data to the driver? Are you allocating a huge dataset?
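For illustration, these are the kinds of steps that commonly exhaust driver memory (a hypothetical sketch; `df` stands in for any large Spark DataFrame and "key" for one of its columns):

# Pulls every row into driver memory at once; with enough data the Python
# process can be killed (SIGABRT) before Spark ever reports an OOM.
pdf = df.toPandas()
rows = df.collect()

# Safer: reduce or sample in Spark first, then bring back a small result.
pdf_small = df.limit(10000).toPandas()
counts = df.groupBy("key").count().toPandas()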

TalY
New Contributor II

I did notice log messages about memory allocation failure in the driver's logs a couple of times, so I tried two things: using a smaller DataFrame (from 200k rows down to 10k) and optimizing the pandas usage. Neither helped. After some searching over the weekend, I found that adding the following lines prevents the crash:

import logging

# Raise py4j's log level so its chatty output no longer floods the notebook
logging.getLogger("py4j").setLevel(logging.ERROR)
logging.getLogger("py4j.java_gateway").setLevel(logging.ERROR)

Also, I have been successfully running this notebook on my personal computer, which has 32 GB of RAM, while the Databricks driver is an "m5d.4xlarge", which has 64 GB.

I would of course prefer a cleaner solution, so given all this, is OOM still the most probable direction?
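One way to sanity-check the OOM theory from inside the notebook would be to print driver memory right before the failing step (a minimal sketch, assuming the psutil package is available on the cluster):

import psutil

# Snapshot driver memory just before the suspect cell runs
mem = psutil.virtual_memory()
print(f"driver memory used: {mem.used / 1e9:.1f} GB of {mem.total / 1e9:.1f} GB")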

shan_chandra
Databricks Employee

@TalY - Could you please let us know the DBR version you are running? Kindly try DBR 12.2 LTS or above.

In order to debug this, there should be an hs_err_pid.log file with the problematic JVM details under the "Python kernel unresponsive" error stack trace.
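If the file is not surfaced in the error output, it can also be searched for from a notebook cell (a rough sketch; the /databricks/driver location is an assumption about the JVM's working directory on the driver):

import glob

# JVM fatal-error logs (hs_err_pid<pid>.log) are written to the JVM's working
# directory; on a Databricks driver that is typically /databricks/driver.
print(glob.glob("/databricks/driver/hs_err_pid*.log"))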

TalY
New Contributor II

I am using DBR 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12).

Fatal error: The Python kernel is unresponsive.
---------------------------------------------------------------------------
The Python process exited with exit code 134 (SIGABRT: Aborted).
---------------------------------------------------------------------------
The last 10 KB of the process's stderr and stdout can be found below. See driver logs for full logs.
---------------------------------------------------------------------------
Last messages on stderr:
Tue Aug 1 18:02:57 2023 Connection to spark from PID 2632
Tue Aug 1 18:02:57 2023 Initialized gateway on port 45165
Tue Aug 1 18:02:57 2023 Connected to spark.
[IPKernelApp] WARNING | No such comm: LSP_COMM_ID
[IPKernelApp] WARNING | No such comm: LSP_COMM_ID
[IPKernelApp] WARNING | No such comm: LSP_COMM_ID
[2023-08-01 18:06:22,007] [INFO] Received command c on object id p0
[2023-08-01 18:06:22,030] [INFO] Received command c on object id p0

And:
Last messages on stdout:
NOTE: When using the `ipython kernel` entry point, Ctrl-C will not work. To exit, you will have to explicitly quit this process, by either sending "quit" from a client, or using Ctrl-\ in UNIX-like environments. To read more about this, see https://github.com/ipython/ipython/issues/2049

Those log lines led me in the direction of changing the log level.
