
Bootstrap Timeout during cluster start

LearningDatabri
Contributor II

Sometimes when I start a cluster I get a bootstrap timeout error. What is the reason? When I try again, the cluster starts.

Accepted Solutions

Prabakar
Databricks Employee

We are aware of this issue, and our engineering team has made a fix. The fix will be deployed this week.

8 REPLIES

Prabakar
Databricks Employee

In which region is your workspace deployed, and which cloud provider are you using? We recently applied a fix to resolve this bootstrap error.

LearningDatabri
Contributor II

AWS us-east-1

Prabakar
Databricks Employee

We are aware of this issue, and our engineering team has made a fix. The fix will be deployed this week.

Hi, we have seen this issue on an old workspace (AWS Oregon, us-west-2) a couple of times: first on 26 July and again today, 29 July.

Cluster '0729-040008-6c2diu36' was terminated. Reason: BOOTSTRAP_TIMEOUT (SERVICE_FAULT). Parameters: databricks_error_message:[id: InstanceId(i-0f5f6b6bd5d37a52a), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-3325107197771046-7201c2b6-6e0c-4e92-9792-2eaf55bc5ffd), lastStatusChangeTime: 1722225709727, groupIdOpt Some(-7472147180756461083),requestIdOpt Some(0729-040008-6c2diu36-b8587b40-d987-4795-a),version 3] with threshold 700 seconds timed out after 700133 milliseconds. Please check network connectivity from the data plane to the control plane., instance_id:i-0f5f6b6bd5d37a52a.
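
For anyone triaging a message like the one above, here is a minimal Python sketch that pulls out the fields that matter (reason code, instance ID, threshold, elapsed time) and converts lastStatusChangeTime from epoch milliseconds to UTC. The helper name parse_termination and the regular expressions are my own assumptions based on this one message, not a documented Databricks format, and the message is shortened for readability.

```python
import re
from datetime import datetime, timezone

# Termination message from the post above, shortened for readability.
MESSAGE = (
    "Cluster '0729-040008-6c2diu36' was terminated. "
    "Reason: BOOTSTRAP_TIMEOUT (SERVICE_FAULT). Parameters: "
    "databricks_error_message:[id: InstanceId(i-0f5f6b6bd5d37a52a), "
    "status: INSTANCE_INITIALIZING, lastStatusChangeTime: 1722225709727] "
    "with threshold 700 seconds timed out after 700133 milliseconds."
)

# Patterns are assumptions inferred from this single message.
PATTERNS = {
    "cluster_id": r"Cluster '([^']+)'",
    "code": r"Reason: (\w+)",
    "type": r"Reason: \w+ \((\w+)\)",
    "instance_id": r"InstanceId\(([^)]+)\)",
    "status": r"status: (\w+)",
    "last_status_change_ms": r"lastStatusChangeTime: (\d+)",
    "threshold_s": r"threshold (\d+) seconds",
    "elapsed_ms": r"timed out after (\d+) milliseconds",
}
NUMERIC_FIELDS = ("last_status_change_ms", "threshold_s", "elapsed_ms")


def parse_termination(message: str) -> dict:
    """Hypothetical helper: extract known fields, None where a field is absent."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        fields[name] = match.group(1) if match else None
    for name in NUMERIC_FIELDS:
        if fields[name] is not None:
            fields[name] = int(fields[name])
    if fields["last_status_change_ms"] is not None:
        # Whole seconds only; the millisecond remainder is dropped.
        fields["last_status_change_utc"] = datetime.fromtimestamp(
            fields["last_status_change_ms"] // 1000, tz=timezone.utc
        )
    return fields


info = parse_termination(MESSAGE)
print(info["code"], info["type"], info["last_status_change_utc"].isoformat())
```

The converted timestamp is useful when lining the failure up with cloud provider events or with other reports in the same region.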

Bert613
New Contributor II

Hello, we have started to see this issue in Canada Central, on Azure, mostly around 6:15am.

Manish_Pundir
New Contributor II

I set spark.sql.legacy.timeParserPolicy to LEGACY at the cluster level, and the cluster started working.

M_waleed
New Contributor II

The bootstrap timeout error happens when the cluster setup process takes too long, usually due to network delays, resource allocation issues, or communication problems with the cloud provider. It's likely a temporary glitch, which is why retrying works the next time. If it keeps happening, check your cluster configurations or contact support for troubleshooting.
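
Since a retry usually succeeds, one way to automate it is a small retry loop with exponential backoff. The sketch below is only an illustration: start_cluster is a hypothetical callable you would supply yourself (for example, a wrapper around whatever client you use to start the cluster and wait for its final state), and the attempt count and delays are arbitrary assumptions, not recommended values.

```python
import time
from typing import Callable


def start_with_retry(
    start_cluster: Callable[[], str],
    max_attempts: int = 3,
    base_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry a cluster start on BOOTSTRAP_TIMEOUT; return the attempts used.

    start_cluster is a hypothetical callable supplied by the caller. It is
    assumed to start the cluster, wait for a final state, and return either
    "RUNNING" or a termination code such as "BOOTSTRAP_TIMEOUT".
    """
    for attempt in range(1, max_attempts + 1):
        result = start_cluster()
        if result == "RUNNING":
            return attempt
        if result != "BOOTSTRAP_TIMEOUT":
            # Other termination codes are unlikely to clear up on retry.
            raise RuntimeError(f"cluster failed with non-retryable code {result}")
        if attempt < max_attempts:
            # Wait 60 s, 120 s, 240 s, ... between attempts.
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError(f"cluster still timing out after {max_attempts} attempts")


# Usage with a stand-in that times out once and then succeeds.
outcomes = iter(["BOOTSTRAP_TIMEOUT", "RUNNING"])
attempts = start_with_retry(lambda: next(outcomes), sleep=lambda s: None)
print(f"cluster running after {attempts} attempt(s)")
```

Only BOOTSTRAP_TIMEOUT is retried here, so a persistent failure or a different termination reason still surfaces as an error that you can take to support.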

PabloCSD
Contributor III

I'm having this issue too (and also a colleague) in the eastus region:

run failed with error message
 Cluster '1212-135649-a2x5p6jg' was terminated. Reason: BOOTSTRAP_TIMEOUT (SERVICE_FAULT). Parameters: databricks_error_message:Bootstrap Timeout. Please try again later.
Instance bootstrap inferred timeout reason: UnknownReason
Failure message: Bootstrap script took too long and timeout. 
VM extension code: ProvisioningState/succeeded
instanceId: InstanceId(022733c3085c4460ade4fa8042d12c85)
workerEnv: workerenv-6505902102920425
Additional details (may be truncated):
