08-04-2016 10:49 AM
I created some ETL using DataFrames in Python. It used to run in ~180 sec, but it is now taking ~1200 sec. I have been changing it, so it could be something I introduced, or something in the environment.
Part of the process is appending results into a file on S3.
I am looking at the Spark Jobs page and I cannot see that any of them is active.
While I was writing this, I got: org.apache.spark.SparkException: Job aborted.
Command took 1274.63s -- by xxxxxxxx@gmail.com
at 8/4/2016, 12:44:17 PM on def4 (150 GB)
I have attached output that I got:
I assume that I should be able to see in the Spark UI what is active. I was surprised that Active Tasks on all executors was 0. Should I look at something else?
I tried to restart the cluster, but it was the same before and after. I used the same version of Spark 1.6.2 (Hadoop 2).
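Besides the Spark UI itself, the same job list is exposed as a JSON REST API, so you can poll for active jobs programmatically. A minimal sketch, assuming the standard Spark monitoring API (available since Spark 1.4); the driver host and application id below are placeholders, port 4040 is the default for the Spark UI, and on a managed platform like Databricks the UI may be proxied elsewhere:

```python
import json
from urllib.request import urlopen

def jobs_url(host, app_id, port=4040):
    """Build the Spark monitoring REST endpoint that lists running jobs."""
    return f"http://{host}:{port}/api/v1/applications/{app_id}/jobs?status=running"

def active_jobs(host, app_id, port=4040):
    """Fetch the currently running jobs as parsed JSON (needs a live driver)."""
    with urlopen(jobs_url(host, app_id, port)) as resp:
        return json.load(resp)

# Calling active_jobs() requires a reachable driver, so here we only show
# the URL it would hit; "driver-host" and the app id are placeholders.
print(jobs_url("driver-host", "app-20160804-0001"))
```

If this returns an empty list while a cell still shows "Running Command", the time is being spent on the driver (or in the notebook layer), not in Spark tasks.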
08-04-2016 11:43 AM
While I was waiting for a response (I had lunch in the meantime), I decided to do something else in this notebook, so I cloned it...
I have some initialization code in the notebook. It was taking 60 sec before cloning and 1.4 sec after. Wow!
Did you (Databricks support) do something on the cluster?
I am going to run my ETL command.
It was running very fast and then it got "stuck" again. I do not see any Spark job running.
08-04-2016 12:35 PM
In the meantime I got the idea to look into the driver log. I've found this:
2016-08-04T19:19:57.980+0000: [GC (Allocation Failure) [PSYoungGen: 6827008K->52511K(7299584K)] 7660819K->886330K(22848000K), 0.0142959 secs] [Times: user=0.08 sys=0.01, real=0.01 secs]
...
04T19:27:03.294+0000: [GC (Allocation Failure) [PSYoungGen: 7270001K->134234K(7454208K)] 8103861K->968093K(23002624K), 0.0509207 secs] [Times: user=0.33 sys=0.00, real=0.05 secs]
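For what it's worth, both of those lines are routine young-generation collections with pauses well under 0.1 s, so on their own they do not explain a 20-minute run. A small script can tally the pauses over a whole driver log to check whether GC time is actually significant. This is only a sketch: it understands just the minor-GC line format shown above (ParallelGC, as in these logs), and `sample` is built from the two lines quoted in this post:

```python
import re

# Matches minor-GC lines like:
# [PSYoungGen: 6827008K->52511K(7299584K)] 7660819K->886330K(22848000K), 0.0142959 secs]
GC_LINE = re.compile(
    r"\[PSYoungGen: \d+K->\d+K\(\d+K\)\] "        # young-gen transition
    r"(?P<before>\d+)K->(?P<after>\d+)K\(\d+K\)"  # whole-heap transition
    r", (?P<secs>[\d.]+) secs\]"
)

def summarize_gc(log_text):
    """Return (total_pause_secs, total_kib_reclaimed) across all minor-GC lines."""
    pause, freed = 0.0, 0
    for m in GC_LINE.finditer(log_text):
        pause += float(m.group("secs"))
        freed += int(m.group("before")) - int(m.group("after"))
    return pause, freed

sample = (
    "2016-08-04T19:19:57.980+0000: [GC (Allocation Failure) "
    "[PSYoungGen: 6827008K->52511K(7299584K)] "
    "7660819K->886330K(22848000K), 0.0142959 secs]\n"
    "2016-08-04T19:27:03.294+0000: [GC (Allocation Failure) "
    "[PSYoungGen: 7270001K->134234K(7454208K)] "
    "8103861K->968093K(23002624K), 0.0509207 secs]\n"
)
pause, freed = summarize_gc(sample)
print(f"total pause: {pause:.4f} s, heap freed: {freed} KiB")
```

If the totals stay small over the whole stall, the slowdown is somewhere other than garbage collection; long "Full GC" pauses (a different line format, not handled here) would be the worrying case.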
08-04-2016 01:02 PM
The process finally finished after 3600 sec (3x slower than the long duration that I was complaining about).
08-05-2016 01:16 PM
Today at some point I created a new cluster again.
Suddenly everything got much faster. It is back to 270 - 330 sec.
My question still stands: how do I know what the server is doing, and why it is slow or stuck?
Btw, how long does it take to moderate a question?
03-04-2019 10:36 PM
Was this issue resolved? I'm also getting the same problem on my spark cluster.
10-23-2019 09:38 AM
I have a similar issue. Several times per week I see a very slow (5+ minutes) "Running command" state on a cell that should take under 1 second to execute. Restarting the cluster usually solves the problem, but it is still a major inconvenience.
12-06-2019 09:13 AM
Check for GC (garbage collection) errors in standard out for the cluster.
https://databricks.com/blog/2015/05/28/tuning-java-garbage-collection-for-spark-applications.html
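GC detail only shows up in stdout if it was switched on with JVM flags. A sketch of what that might look like in `spark-defaults.conf` terms, assuming the Java 7/8-era HotSpot flags that Spark 1.6 clusters ran with (on Databricks you would set these via the cluster's Spark config rather than a file):

```
spark.driver.extraJavaOptions   -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
```

The linked blog post also discusses changing the collector itself (e.g. adding `-XX:+UseG1GC`), which is a separate tuning decision from merely logging.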
01-14-2020 10:04 AM
I am getting this same issue. Occasionally a cell will display "Running Command" for as long as an hour. This can happen even for simple commands that ordinarily run in less than a second. I have tried restarting the cluster and attaching to a different cluster. Nothing seems to help.
04-19-2020 06:19 PM
Hi,
I am facing the same issue. Has anyone found a solution?
05-19-2020 09:21 AM
Mm, probably yes
04-28-2022 09:09 AM
I am having a very similar problem.
Since yesterday, without a known reason, some commands that used to run daily are now stuck in a "Running command" state. Commands like:
dataframe.show(n=1)
dataframe.toPandas()
dataframe.describe()
dataframe.write.format("csv").save(location)
are now stuck even for quite small DataFrames (for example, 28 rows and 5 columns). I would appreciate any help, since the problem also affects important daily jobs.