Databricks Community

owen1 · ‎04-24-2023

I scheduled the workflow to run at 12:00 every day, but it failed with the error message below, and I don't know why.

Run result unavailable: run failed with error message

Unexpected failure while waiting for the cluster (0506-023332-glkykcrs) to be ready: Cluster 0506-023332-glkykcrs is in unexpected state Terminated: UNEXPECTED_LAUNCH_FAILURE(SERVICE_FAULT): databricks_error_message:com.google.common.util.concurrent.UncheckedExecutionException: com.databricks.rpc.ReliableJettyClient$MaxRetriesExceededException: Max retries exhausted with RPC com.databricks.api.proto.central.GetCustomerStorageInfo, max retry count: 3

pvignesh92 · ‎04-25-2023

@Sangwoo Lee Hi, this seems to be an infra-related issue. Please check the cluster event logs to see whether your cloud provider had the requested instances available to provision. If the problem persists, you can also try choosing an instance type that is more commonly available in your cloud region. We faced this kind of issue when spinning up all-purpose clusters while requesting high-compute machines on AWS.
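For anyone who prefers to pull the event log programmatically rather than through the UI, here is a minimal sketch. It assumes the Clusters API `POST /api/2.0/clusters/events` endpoint and a personal access token. The function names, and the `host` and `token` values you pass in, are placeholders of mine and not something from this thread. It only extracts the termination reason (for example `UNEXPECTED_LAUNCH_FAILURE` / `SERVICE_FAULT`) from `TERMINATING` events.

```python
import json
import urllib.request


def fetch_events(host, token, cluster_id, limit=50):
    """Fetch the most recent events for a cluster (newest first).

    Assumes the workspace exposes POST /api/2.0/clusters/events and that
    `token` is a valid personal access token for that workspace.
    """
    body = json.dumps(
        {"cluster_id": cluster_id, "order": "DESC", "limit": limit}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{host.rstrip('/')}/api/2.0/clusters/events",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("events", [])


def termination_reasons(events):
    """Return (timestamp, reason code, reason type) for each TERMINATING event."""
    reasons = []
    for event in events:
        if event.get("type") != "TERMINATING":
            continue
        reason = event.get("details", {}).get("reason", {})
        reasons.append(
            (event.get("timestamp"), reason.get("code"), reason.get("type"))
        )
    return reasons


# Hypothetical usage (replace the host and token with your own values):
# events = fetch_events("https://<your-workspace-host>", "<token>", "0506-023332-glkykcrs")
# for timestamp, code, kind in termination_reasons(events):
#     print(timestamp, code, kind)
```

If the same reason code shows up on several days, that would point to a recurring provisioning or service problem rather than a one-off.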

Murthy1 · ‎04-25-2023

Hello @Sangwoo Lee ,

As mentioned by vignesh, it seems like an infra-related issue.

> Does the user who executes the job have permission to start a cluster?

> If it is not an access issue and you are starting many workflow jobs at the same time, try scheduling one job 5 minutes earlier (just to start the cluster) and schedule the remaining jobs to start together 5 minutes later. The idea is to have the cluster already available when the majority of the jobs need it.
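To make the staggering idea concrete, here is a small sketch that derives the two schedules. It assumes the jobs use Quartz cron syntax (seconds, minutes, hours, day of month, month, day of week) and that they share an existing cluster that the early job can start. The helper names are hypothetical.

```python
from datetime import datetime, timedelta


def quartz_daily(hour, minute):
    """Quartz cron expression for a run at hour:minute every day."""
    return f"0 {minute} {hour} * * ?"


def staggered_schedules(hour, minute, lead_minutes=5):
    """Return cron expressions for a warm-up job and for the main jobs.

    The warm-up job runs `lead_minutes` before the main jobs so that the
    shared cluster is already up when they start.
    """
    # The date is arbitrary; it is only used for the time arithmetic.
    main_time = datetime(2000, 1, 1, hour, minute)
    warmup_time = main_time - timedelta(minutes=lead_minutes)
    return {
        "warmup": quartz_daily(warmup_time.hour, warmup_time.minute),
        "main": quartz_daily(hour, minute),
    }


schedules = staggered_schedules(12, 0)
print(schedules["warmup"])  # 0 55 11 * * ?
print(schedules["main"])    # 0 0 12 * * ?
```

For the 12:00 schedule in the original question, the warm-up job would run at 11:55 and the remaining jobs at 12:00.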

Workflow cluster creation error