cluster start Issues

User16826994223 — Tue, 08 Jun 2021 11:54:32 GMT

Some of our jobs are failing in prod with the error message below.

Can you please check and let us know the reason for this? These jobs are running on a pool cluster.

Run result unavailable: job failed with error message

Unexpected failure while waiting for the cluster (0604-056775-teaks96) to be ready.Cause Unexpected state for cluster (0604-056775-teaks96): UNEXPECTED_LAUNCH_FAILURE(SERVICE_FAULT): databricks_error_message:Encountered unexpected failure on instance InstanceId(63901da48df74d539b078907d929527d), failure code: DEFUNCT_RESOURCE

message: "Defunct Resource Detected"

Re: cluster start Issues

Mooune_DBU — Fri, 18 Jun 2021 23:58:39 GMT

@Kunal Gaurav, this status code occurs only under one of two conditions:

We’re able to request the instances for the cluster but can’t bootstrap them in time
We set up the containers on each instance but can’t start the containers in time

This is an edge case in our service cleanup logic: some containers/clusters might be misidentified as zombie resources even though there is actually no problem at all. We are working on optimizing the classification logic and should have a fix deployed soon.

That being said, don't hesitate to create an Engineering Support ticket with the workspace ID, cluster ID, and region so Databricks can confirm whether something similar has happened in the region.
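
If it helps when filling in the ticket, here is a minimal Python sketch that pulls the identifiers out of an error string shaped like the one quoted in the question. The helper name `extract_ticket_details` and the regular expressions are my own assumptions based only on that one message; this is not a Databricks API, and other messages may be formatted differently. The workspace ID and region do not appear in the message, so those still have to be added by hand.

```python
import re

# The error text exactly as quoted in the original post.
ERROR_MESSAGE = (
    "Unexpected failure while waiting for the cluster (0604-056775-teaks96) "
    "to be ready.Cause Unexpected state for cluster (0604-056775-teaks96): "
    "UNEXPECTED_LAUNCH_FAILURE(SERVICE_FAULT): databricks_error_message:"
    "Encountered unexpected failure on instance "
    "InstanceId(63901da48df74d539b078907d929527d), "
    "failure code: DEFUNCT_RESOURCE"
)

# Hypothetical patterns, inferred from the single message above.
PATTERNS = {
    "cluster_id": r"cluster \(([^)]+)\)",
    "termination_code": r"\b([A-Z][A-Z_]+)\([A-Z_]+\)",
    "instance_id": r"InstanceId\(([^)]+)\)",
    "failure_code": r"failure code: ([A-Z_]+)",
}


def extract_ticket_details(message: str) -> dict:
    """Return the first match for each pattern, or None if it is absent."""
    details = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        details[name] = match.group(1) if match else None
    return details


if __name__ == "__main__":
    for key, value in extract_ticket_details(ERROR_MESSAGE).items():
        print(f"{key}: {value}")
```

Any field the patterns cannot find comes back as None, so a message in a different shape will not raise an error; it just leaves that field for you to fill in manually.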
