Data Engineering

Fatal error: The Python kernel is unresponsive.

Data_Analytics1
Contributor III

I am using multithreading in this job, which creates 8 parallel jobs. It fails a few times a day and sometimes gets stuck in one of the Python notebook cell processes.
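For context, here is a minimal sketch of the pattern, assuming dbutils.notebook.run is what launches the parallel jobs (the notebook paths and timeout below are placeholders):

    from concurrent.futures import ThreadPoolExecutor

    # Placeholder paths for the 8 parallel notebook jobs.
    notebook_paths = [f"/Jobs/task_{i}" for i in range(8)]

    def run_notebook(path):
        # dbutils.notebook.run blocks until the child notebook finishes;
        # the second argument is the timeout in seconds.
        return dbutils.notebook.run(path, 3600)

    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(run_notebook, notebook_paths))

Here is the error: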

The Python process exited with an unknown exit code.

The last 10 KB of the process's stderr and stdout can be found below. See driver logs for full logs.

---------------------------------------------------------------------------

Last messages on stderr:

Tue Feb 7 17:10:18 2023 Connection to spark from PID 24461

Tue Feb 7 17:10:18 2023 Initialized gateway on port 34499

Tue Feb 7 17:10:19 2023 Connected to spark.

---------------------------------------------------------------------------

Last messages on stdout:

NOTE: When using the `ipython kernel` entry point, Ctrl-C will not work.

To exit, you will have to explicitly quit this process, by either sending

"quit" from a client, or using Ctrl-\ in UNIX-like environments.

To read more about this, see https://github.com/ipython/ipython/issues/2049

17 REPLIES

deedstoke
New Contributor II

We are also facing the same issue.

luis_herrera
New Contributor III

The "Fatal error: The Python kernel is unresponsive." message falls under troubleshooting unresponsive Python notebooks or cancelled commands. It can be caused by a number of problems, such as metastore connectivity issues or conflicting libraries. To troubleshoot, check metastore connectivity and review the "Cluster cancels Python command execution due to library conflict" KB article for more information:

https://kb.databricks.com/python-command-cancelled
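A quick way to check metastore connectivity from a notebook cell is a simple catalog query; if it hangs or throws, the metastore is a likely suspect (just a sanity check, not a full diagnosis):

    # If this trivial catalog query hangs or errors, metastore
    # connectivity is a likely cause of the unresponsive kernel.
    spark.sql("SHOW DATABASES").show()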

luis_herrera
New Contributor III

Hey, it seems the issue is related to the driver hitting a memory bottleneck, which causes it to crash with an out-of-memory (OOM) condition and restart, or to become unresponsive due to frequent full garbage collection. Common reasons for the bottleneck include: 1) the driver instance type is not sized for the load executed on the driver, 2) memory-intensive operations are executed on the driver, or 3) many notebooks or jobs are running in parallel on the same cluster.

The fix varies from case to case, but in the absence of specific details the easiest remedy is to increase the driver's memory. Beyond that, avoid memory-intensive operations on the driver, such as collect(), which pulls a large amount of data to the driver, or converting a large DataFrame to Pandas, and avoid running batch jobs on a shared interactive cluster. It is also recommended to distribute workloads across different clusters. (https://kb.databricks.com/jobs/driver-unavailable)
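To illustrate the last point, a rough sketch of swapping driver-heavy calls for alternatives that keep data on the executors (the table name and output path are placeholders):

    df = spark.table("my_database.my_table")  # placeholder table name

    # Driver-heavy patterns that can OOM the driver on large data:
    #   rows = df.collect()      # pulls every row into driver memory
    #   pdf = df.toPandas()      # materializes the whole DataFrame on the driver

    # Safer alternatives:
    sample_pdf = df.limit(1000).toPandas()             # cap what reaches the driver
    df.write.mode("overwrite").parquet("/tmp/output")  # keep the work distributed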

For more information on troubleshooting unresponsive Python notebooks or canceled commands, please refer to the Troubleshooting unresponsive Python notebooks or canceled commands article in the Databricks documentation.

PS: Check #DAIS2023 talks as well
