Hi!
We had a bunch of strange job failures on September 28-29.
Some job runs could not start for a while (30-50 minutes) and then failed with this error:
Unexpected failure while waiting for the cluster (0929-002141-2zkekhdj) to be ready: Cluster 0929-002141-2zkekhdj is in unexpected state Terminated: BOOTSTRAP_TIMEOUT(SUCCESS): databricks_error_message:[id: InstanceId(i-0fc5420c47a8ec703), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-4325872309166545-0cd76812-3736-4d1f-aec9-e5663c7cfd13), lastStatusChangeTime: 1695946919780, groupIdOpt Some(0),requestIdOpt Some(0929-002141-2zkekhdj-ccdc6648-fd47-4587-a),version 1] with threshold 700 seconds timed out after 707904 milliseconds. Please check network connectivity from the data plane to the control plane.,instance_id:i-0fc5420c47a8ec703.
Other job runs failed with this event:
Failed to add 16 containers to the compute. Will attempt retry: true. Reason: Container launch failure
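
In case it helps with triage, here is a minimal Python sketch that pulls the timing fields out of the BOOTSTRAP_TIMEOUT message above. It assumes `lastStatusChangeTime` is a Unix epoch in milliseconds (my guess, not something I have confirmed); the function name `parse_bootstrap_timeout` and the abridged `message` string are only for illustration.

```python
import re
from datetime import datetime, timezone

# Abridged excerpt of the error message quoted above; "..." marks omitted text.
message = (
    "status: INSTANCE_INITIALIZING, ... lastStatusChangeTime: 1695946919780, ... "
    "with threshold 700 seconds timed out after 707904 milliseconds."
)


def parse_bootstrap_timeout(msg: str) -> dict:
    """Extract the timing fields from a BOOTSTRAP_TIMEOUT message (hypothetical helper)."""
    ts_ms = int(re.search(r"lastStatusChangeTime: (\d+)", msg).group(1))
    threshold_s = int(re.search(r"threshold (\d+) seconds", msg).group(1))
    elapsed_ms = int(re.search(r"timed out after (\d+) milliseconds", msg).group(1))
    return {
        # Assumption: the value is epoch milliseconds, interpreted as UTC.
        "last_status_change_utc": datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc),
        "threshold_s": threshold_s,
        "elapsed_s": elapsed_ms / 1000,
    }


info = parse_bootstrap_timeout(message)
print(info["last_status_change_utc"].isoformat())
print(f"waited {info['elapsed_s']} s against a {info['threshold_s']} s threshold")
```

Under that assumption the timestamp decodes to 2023-09-29 00:21:59 UTC, which lines up with the `0929-002141` prefix of the cluster ID, and the instance was still in INSTANCE_INITIALIZING when the 700-second threshold expired.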