Data Engineering

Cluster start issues

User16826994223
Honored Contributor III

Some of our jobs are failing in prod with the error message below.

Can you please check and let us know the reason? These jobs run on clusters backed by an instance pool.

Run result unavailable: job failed with error message

Unexpected failure while waiting for the cluster (0604-056775-teaks96) to be ready.Cause Unexpected state for cluster (0604-056775-teaks96): UNEXPECTED_LAUNCH_FAILURE(SERVICE_FAULT): databricks_error_message:Encountered unexpected failure on instance InstanceId(63901da48df74d539b078907d929527d), failure code: DEFUNCT_RESOURCE

message: "Defunct Resource Detected"

Accepted Solutions

Mooune_DBU
Valued Contributor

@Kunal Gaurav, this status code only occurs in one of two conditions:

  1. We’re able to request the instances for the cluster but can’t bootstrap them in time
  2. We set up the containers on each instance but can’t start them in time

This is an edge case in our service cleanup logic: some containers or clusters may be misidentified as zombie resources even though nothing is actually wrong with them. We are working on improving the classification logic and should have a fix deployed soon.

That said, don't hesitate to create an Engineering Support ticket with the workspace ID, cluster ID, and region so Databricks can confirm whether something similar has happened in that region.
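
If it helps when filing the ticket, here is a minimal sketch for pulling the identifiers out of the error text. It assumes the message has the same shape as the one quoted in the question; `extract_ticket_details` and `ERROR_PATTERN` are hypothetical names, not part of any Databricks SDK. The workspace ID and region do not appear in the message, so you would still need to add those yourself.

```python
import re

# Hypothetical helper: it only parses error text shaped like the message
# quoted in this thread. It does not call any Databricks API.
ERROR_PATTERN = re.compile(
    r"Unexpected state for cluster \((?P<cluster_id>[\w-]+)\): "
    r"(?P<state>\w+)\((?P<fault_type>\w+)\)"
    r".*?InstanceId\((?P<instance_id>\w+)\)"
    r".*?failure code: (?P<failure_code>\w+)",
    re.DOTALL,
)


def extract_ticket_details(error_message: str) -> dict:
    """Return the cluster ID, state, fault type, instance ID and failure code.

    Returns an empty dict if the message does not match the assumed format.
    """
    match = ERROR_PATTERN.search(error_message)
    return match.groupdict() if match else {}


# The error message from the question, copied verbatim.
message = (
    "Unexpected failure while waiting for the cluster (0604-056775-teaks96) "
    "to be ready.Cause Unexpected state for cluster (0604-056775-teaks96): "
    "UNEXPECTED_LAUNCH_FAILURE(SERVICE_FAULT): databricks_error_message:"
    "Encountered unexpected failure on instance "
    "InstanceId(63901da48df74d539b078907d929527d), "
    "failure code: DEFUNCT_RESOURCE"
)

details = extract_ticket_details(message)
for key, value in details.items():
    print(f"{key}: {value}")
```

Returning an empty dict on a non-matching message keeps the helper safe to run over arbitrary job logs, since other failure types will simply produce no details.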

