06-08-2021 04:54 AM
Some of our jobs are failing in prod with the error message below.
Can you please check and let us know the reason? These jobs run on a pool cluster.
Run result unavailable: job failed with error message
Unexpected failure while waiting for the cluster (0604-056775-teaks96) to be ready.Cause Unexpected state for cluster (0604-056775-teaks96): UNEXPECTED_LAUNCH_FAILURE(SERVICE_FAULT): databricks_error_message:Encountered unexpected failure on instance InstanceId(63901da48df74d539b078907d929527d), failure code: DEFUNCT_RESOURCE
message: "Defunct Resource Detected"
Labels: Cluster, Cluster management, Error Message
Accepted Solutions
06-18-2021 04:58 PM
@Kunal Gaurav, this status code occurs only under one of two conditions:
- We’re able to request the instances for the cluster but can’t bootstrap them in time.
- We set up the containers on each instance but can’t start the containers in time.
This is an edge case in our service cleanup logic: some containers or clusters can be misidentified as zombie resources even though nothing is actually wrong with them. We are working on optimizing the classification logic and should have a fix deployed soon.
That being said, don't hesitate to create an Engineering Support ticket with the workspace ID, cluster ID, and region so Databricks can confirm whether something similar has happened in the region.
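If it helps when filling in the ticket, here is a minimal sketch for pulling the identifiers out of the failure text. The function name `extract_ticket_details` and the regular expressions are my own assumptions, based only on the error message quoted in this thread; other messages may be formatted differently, and the workspace ID and region are not in the message, so they still have to be added by hand.

```python
import re

# Assumed patterns, derived only from the error text quoted in this thread.
CLUSTER_RE = re.compile(r"cluster \(([^)]+)\)")
INSTANCE_RE = re.compile(r"InstanceId\(([^)]+)\)")
FAILURE_RE = re.compile(r"failure code:\s*(\w+)")
STATE_RE = re.compile(r"([A-Z_]+)\(([A-Z_]+)\)")


def extract_ticket_details(message):
    """Return the IDs and codes found in a cluster launch failure message.

    Fields that cannot be found are returned as None.
    """
    def first(pattern):
        match = pattern.search(message)
        return match.group(1) if match else None

    state = STATE_RE.search(message)
    return {
        "cluster_id": first(CLUSTER_RE),
        "instance_id": first(INSTANCE_RE),
        "failure_code": first(FAILURE_RE),
        # e.g. UNEXPECTED_LAUNCH_FAILURE and SERVICE_FAULT in the message above
        "state": state.group(1) if state else None,
        "state_type": state.group(2) if state else None,
    }
```

Calling `extract_ticket_details(error_text)` on the run's error message returns a dictionary that can be pasted into the ticket alongside the workspace ID and region.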

