
Long running jobs get lost

jenshumrich
New Contributor III

Hello,
I scheduled a long-running job and, surprisingly, it neither terminates (so the cluster cannot shut down) nor continues running, even though its state is still "Running":

jenshumrich_0-1712742957610.png
But the truth is that the job has miserably failed:

jenshumrich_2-1712743008070.png

jenshumrich_3-1712743098546.png

Sadly, this means the automation is not working. Any hint would be appreciated.

 

2 REPLIES

shan_chandra
Esteemed Contributor

@jenshumrich - There is not much information to go on. However, to begin with, can you please check whether the task has enough parallelism to execute (spark.sql.shuffle.partitions and the number of cores on the cluster)?
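
As a rough way to run that check, the sketch below compares the configured shuffle partition count with the cores available to the job. It is only an illustration, not a diagnosis of this particular failure: the helper names `shuffle_tasks_per_core` and `report_parallelism` are hypothetical, and it assumes a notebook where `spark` is a regular SparkSession with `sparkContext` available (which may not hold on every cluster access mode) and where `defaultParallelism` approximates the total number of executor cores.

```python
def shuffle_tasks_per_core(shuffle_partitions: int, total_cores: int) -> float:
    """Return how many shuffle tasks each core has to work through per stage."""
    if shuffle_partitions <= 0 or total_cores <= 0:
        raise ValueError("shuffle_partitions and total_cores must be positive")
    return shuffle_partitions / total_cores


def report_parallelism(spark) -> float:
    """Read both values from a live SparkSession and print the ratio.

    Assumption: `spark` is an existing SparkSession (as in a Databricks notebook).
    """
    # spark.conf.get returns the setting as a string, e.g. "200"
    shuffle_partitions = int(spark.conf.get("spark.sql.shuffle.partitions"))
    # Assumption: defaultParallelism roughly equals the total executor cores
    total_cores = spark.sparkContext.defaultParallelism
    ratio = shuffle_tasks_per_core(shuffle_partitions, total_cores)
    print(
        f"shuffle partitions: {shuffle_partitions}, "
        f"cores: {total_cores}, tasks per core: {ratio:.1f}"
    )
    return ratio
```

In a notebook, `report_parallelism(spark)` prints the ratio. A value below 1 means some cores sit idle during shuffle stages, while a very large value means each core processes many tasks one after another; neither by itself explains a job stuck in "Running", so treat it as a first data point only.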

Lakshay
Esteemed Contributor

Have you looked at the SQL plan to see what Spark job 72 was doing?
