Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am using multithreading in this job, which creates 8 parallel jobs. It fails a few times a day and sometimes gets stuck in one of the Python notebook cells. Here is the error: The Python process exited with an unknown exit code. The last 10 KB of the process's...
Hey, it seems the issue is that the driver is hitting a memory bottleneck, which causes it either to crash with an out-of-memory (OOM) condition and restart, or to become unresponsive due to frequent full garbage collection. The reason for th...
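For driver-side pressure from many threads, a common first step is to bound how many tasks run at once and to record per-task failures so a failing task is visible rather than lost. A minimal sketch, assuming a hypothetical `run_task` function standing in for whatever each parallel job does; `run_all` and its `max_workers` default are illustrative names and values, not from the original post.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(task_id):
    """Hypothetical placeholder for one parallel job (e.g. a notebook run)."""
    if task_id < 0:
        raise ValueError(f"invalid task id: {task_id}")
    return task_id * 2


def run_all(task_ids, max_workers=4, task=run_task):
    """Run tasks with bounded concurrency; return (results, errors) keyed by task id."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its task id so failures can be attributed.
        futures = {pool.submit(task, tid): tid for tid in task_ids}
        for future in as_completed(futures):
            tid = futures[future]
            try:
                results[tid] = future.result()
            except Exception as exc:  # record the failure instead of losing it
                errors[tid] = exc
    return results, errors
```

Lowering `max_workers` trades throughput for less concurrent load on the driver; the right value depends on the workload and would need to be found by experiment.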
I am running a parameterized Auto Loader notebook in a workflow. This notebook is being called 29 times in parallel, and FYI UC is also enabled. I am facing this error: java.lang.Exception: Unable to start python kernel for ReplId-79217-e05fc-0a4ce-2, ke...
@Harsh Paliwal: The error message suggests that there might be a conflict with the xtables lock. One thing you could try is to add the -w option, as suggested by the error message. You can add the following command to the beginning of your notebook t...
I am running a Jupyter notebook on a cluster with the following configuration:
12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12)
Worker type: i3.xlarge, 30.5 GB memory, 4 cores
Min 2 and max 8 workers
cursor = conn.cursor()
cursor.execute(
"""
...
Hi, could you please confirm the usage of your cluster while running this job? You can monitor the performance with different metrics here: https://docs.databricks.com/clusters/clusters-manage.html#monitor-performance. Also, please tag @Debayan with...
I submitted around 90 jobs at a time to Databricks. The jobs ran continuously for 2 hours, and after that I got the fatal error "Python kernel is unresponsive". I am using Databricks runtime version 11.2. Cluster configuration details are given...
Hi @Dhanaraj Jogihalli, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.
Hi all, I am running a preprocessing step to create my training set and test set. Does anyone know why, during execution, my cell gives the error "RuntimeException: The python kernel is unresponsive."? How can I solve it?
Hey there @Valerio Goretti, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from...
Hey guys, I'm using Petastorm to train a DNN. First I convert the Spark DataFrame with make_spark_converter and then open a reader on the materialized dataset. When I start a training session on only a subset of the data everything works fine, but when I'm using all...
Same error. This started a few days ago on notebooks that used to run fine in the past. Now, I cannot finish a notebook. I have already disabled almost all output being streamed to the result buffer, but the problem persists. I am left with <50 lines ...
I am working in a notebook on JupyterHub. I am using a PySpark DataFrame for analyzing text. More precisely, I am doing sentiment analysis of newspaper articles. The code works until I get to some point where the kernel is busy and after approximately...
Do you actually run the code in a distributed environment (meaning a driver and multiple workers)? If not, there is no use in using PySpark, as all code will be executed locally.
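One way to check this is to look at the Spark master URL: a value starting with `local` (such as `local[*]`) means Spark is running on a single machine rather than on a cluster with separate workers. A small sketch; `is_local_master` is a hypothetical helper name, and in a live session the URL would come from `spark.sparkContext.master`.

```python
def is_local_master(master_url):
    """Return True if a Spark master URL denotes local (single-machine) mode."""
    return master_url.strip().startswith("local")


# In a live PySpark session (assumes an existing `spark` session object):
#   is_local_master(spark.sparkContext.master)
```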