Data Engineering

How to find out why the cluster is in PENDING state for so long?

ivanychev
Contributor

I'm using Databricks on AWS. Our clusters typically stay in the PENDING state for 5-8 minutes after they are created. I would like to find out why (EC2 instance provisioning? a slow Docker image download? something else?). The cluster logs are not helpful enough, because I only see the timestamps of init script execution, which in our case takes ~2 seconds.

I'd like to improve startup times. How can I find out what takes so much time to launch the cluster? Is there some logging or event emitting that I can read and analyze?

5 REPLIES

Vivian_Wilfred
Honored Contributor

Hi @Sergey Ivanychev, did you check whether any legacy init scripts are loaded on DBFS? Look for any scripts under dbfs:/databricks/init.

Nope, there's nothing

Prabakar
Esteemed Contributor III

Hi @Sergey Ivanychev, while the cluster is starting, you can see its status on the compute page. Hover the mouse pointer over the green rotating circle to the left of the cluster name. A tooltip shows what is currently happening on the cluster. When I captured the screenshot, the cluster was looking for new nodes. This might give you some insight into why there is a delay.

[Screenshot: status tooltip for a starting cluster on the compute page]

That sounds promising, thanks! Is this data available via any API or logged anywhere?

Prabakar
Esteemed Contributor III

Unfortunately, this is available neither in the UI nor via the API. We get this information only from the backend logs. If you feel this information is required for your analysis, I would recommend raising a feature request via our ideas portal.

https://docs.databricks.com/resources/ideas.html
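
For a coarser view of where the time goes, something like the sketch below may still help. It assumes the Clusters API events endpoint (POST /api/2.0/clusters/events) is reachable in your workspace and that each returned event carries a `timestamp` in epoch milliseconds and a `type`. It does not expose the step-by-step status shown in the tooltip, only the lifecycle events the API chooses to report. The helper names `fetch_cluster_events` and `phase_durations` are made up for illustration, and pagination is ignored.

```python
import json
import urllib.request


def fetch_cluster_events(host, token, cluster_id, limit=50):
    """Fetch lifecycle events for one cluster, oldest first.

    Assumption: `host` is the workspace URL (e.g. "https://<workspace>")
    and `token` is a personal access token. Only the first page is read.
    """
    body = json.dumps(
        {"cluster_id": cluster_id, "order": "ASC", "limit": limit}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/2.0/clusters/events",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("events", [])


def phase_durations(events):
    """Return (from_type, to_type, seconds) for each pair of consecutive events."""
    ordered = sorted(events, key=lambda e: e["timestamp"])
    return [
        (a["type"], b["type"], (b["timestamp"] - a["timestamp"]) / 1000.0)
        for a, b in zip(ordered, ordered[1:])
    ]
```

Calling `phase_durations(fetch_cluster_events(host, token, cluster_id))` yields the elapsed seconds between consecutive events, which at least shows whether most of the wait falls before or after the init scripts run.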
