
Bootstrap Timeout during cluster start on AWS cloud

Manimkm08
New Contributor III

I sometimes get the error below when the cluster starts. I have attached the AWS system log for the instance named in the error. In recent days the error has become very frequent. I have seen that the same error was reported earlier and marked as resolved by the team.

Error:

{
  "reason": {
    "code": "BOOTSTRAP_TIMEOUT",
    "parameters": {
      "databricks_error_message": "[id: InstanceId(i-0a2c7c58a6ffcb69f), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-1009608583880808-17b7de8b-3026-44d9-ad08-2cc4776d9067), lastStatusChangeTime: 1667548831248, groupIdOpt Some(-8704258090982298271),requestIdOpt Some(1104-080005-ioy9s8op-62e7ec83-2289-45b8-8),version 2] with threshold 700 seconds timed out after 700519 milliseconds. Please check network connectivity from the data plane to the control plane.",
      "instance_id": "i-0a2c7c58a6ffcb69f"
    }
  }
}
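For triage, it can help to pull the key fields out of the termination reason so that failures can be lined up against VPC or network events. The sketch below is only an illustration: the function name is hypothetical, the message is shortened to the fields it uses, and it assumes `lastStatusChangeTime` is a Unix epoch value in milliseconds (which appears consistent with the `1104-080005` prefix of the request ID, but is not confirmed in this thread).

```python
import json
import re
from datetime import datetime, timezone

# Termination reason from the post above. The message is shortened here to the
# fields this sketch reads; the full text works the same way.
RAW = '''{"reason": {"code": "BOOTSTRAP_TIMEOUT", "parameters": {
  "databricks_error_message": "[id: InstanceId(i-0a2c7c58a6ffcb69f), status: INSTANCE_INITIALIZING, lastStatusChangeTime: 1667548831248] with threshold 700 seconds timed out after 700519 milliseconds.",
  "instance_id": "i-0a2c7c58a6ffcb69f"}}}'''


def summarize_bootstrap_timeout(raw: str) -> dict:
    """Extract instance, status, and timing fields from a termination reason (hypothetical helper)."""
    reason = json.loads(raw)["reason"]
    message = reason["parameters"]["databricks_error_message"]

    def grab(pattern: str):
        match = re.search(pattern, message)
        return match.group(1) if match else None

    changed_ms = grab(r"lastStatusChangeTime: (\d+)")
    threshold = grab(r"threshold (\d+) seconds")
    elapsed_ms = grab(r"timed out after (\d+) milliseconds")
    return {
        "code": reason["code"],
        "instance_id": reason["parameters"].get("instance_id"),
        "status": grab(r"status: (\w+)"),
        # ASSUMPTION: the timestamp is epoch milliseconds.
        "last_status_change_utc": (
            datetime.fromtimestamp(int(changed_ms) / 1000, tz=timezone.utc)
            if changed_ms else None
        ),
        "threshold_s": int(threshold) if threshold else None,
        "elapsed_s": int(elapsed_ms) / 1000 if elapsed_ms else None,
    }


summary = summarize_bootstrap_timeout(RAW)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Collecting these summaries over several failed starts makes it easier to see whether the timeouts cluster around particular times, subnets, or instances.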


karthik_p
Esteemed Contributor

@Mani Srini Are you seeing this error in a newly configured environment, or in one that was working before? Have there been any changes on the cloud VPC side? This looks like a connectivity issue between the data plane and the control plane. Please make sure all the required ports are open for communication between the Databricks control plane and the customer data plane.
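One way to test this kind of connectivity is a plain TCP probe run from a machine in the same subnet and security group as the cluster nodes. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the host and port to probe are deliberately left as command-line arguments, because the correct control plane endpoints depend on the region and deployment and are not given in this thread.

```python
import socket
import sys
import time


def probe_tcp(host: str, port: int, timeout_s: float = 5.0):
    """Attempt a single TCP connection and return (ok, detail)."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, f"connected in {time.monotonic() - started:.2f}s"
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: python probe.py <host> <port>")
    else:
        ok, detail = probe_tcp(sys.argv[1], int(sys.argv[2]))
        print(f"{sys.argv[1]}:{sys.argv[2]} -> {'OK' if ok else 'FAILED'} ({detail})")
```

A successful connection only shows that a TCP handshake completed at that moment. For an intermittent failure, the probe would need to be repeated over time and compared with the timestamps of the failed cluster starts.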

Kaniz
Community Manager

Hi @Mani Srini, we haven’t heard from you since the last response from @karthik p, and I was checking back to see if his suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.

Manimkm08
New Contributor III

@karthik p @Kaniz Fatma I haven't seen the error in the past three days, and I haven't made any changes on my end either. I am wondering why it happened earlier but not in recent days. Can you please tell me which ports need to be open to avoid this issue in the future?

Manimkm08
New Contributor III

@Kaniz Fatma @karthik p Since this morning I have been facing the issue again. It seems to be intermittent, and it fails the pipeline in the middle of the ETL process. I have not been able to find the exact root cause. Can someone suggest a workaround or a solution?

karthik_p
Esteemed Contributor

@Mani Srini This kind of issue can usually be resolved by validating the security groups and the route table configuration of the VPC used for the Databricks instance. Please check that all the prerequisites in the following document are met, in particular the subnet route table configuration and the security groups: Customer-managed VPC | Databricks on AWS
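The security group part of that check can be sketched as a small script. The example below works on egress rules in the shape returned by the EC2 DescribeSecurityGroups API (the `IpPermissionsEgress` list, with `IpProtocol`, `FromPort`, `ToPort`, and `IpRanges` keys) and reports which TCP ports are not open to `0.0.0.0/0`. The function names are hypothetical, and the port tuple is an assumption based on a reading of the Customer-managed VPC page. It may be incomplete or not apply to every deployment, so take the authoritative list from that page.

```python
# ASSUMPTION: example egress ports only; confirm the real list against the
# "Customer-managed VPC | Databricks on AWS" documentation for your setup.
REQUIRED_EGRESS_TCP_PORTS = (443, 3306, 6666)


def rule_allows(rule: dict, port: int, cidr: str = "0.0.0.0/0") -> bool:
    """Return True if one egress rule permits TCP traffic to `port` for `cidr`."""
    protocol = rule.get("IpProtocol")
    if protocol not in ("tcp", "-1"):  # "-1" means all protocols and ports
        return False
    if protocol == "tcp":
        if not (rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)):
            return False
    # Exact CIDR match only; narrower or prefix-list rules are not evaluated.
    return any(r.get("CidrIp") == cidr for r in rule.get("IpRanges", []))


def missing_egress_ports(egress_rules, required=REQUIRED_EGRESS_TCP_PORTS):
    """List the required ports that no rule in `egress_rules` opens."""
    return [
        port for port in required
        if not any(rule_allows(rule, port) for rule in egress_rules)
    ]


# Made-up rules for illustration, not taken from the poster's VPC.
sample_rules = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 3306, "ToPort": 3306,
     "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
]
print("missing egress ports:", missing_egress_ports(sample_rules))
```

In practice the rule list would come from the AWS console, the CLI, or an SDK call rather than a hard-coded sample. This covers security groups only, so the subnet route tables and network ACLs still need to be reviewed separately.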


Kaniz
Community Manager

Hi @Mani Srini, we haven’t heard from you since the last response from @karthik_p (Customer), and I was checking to see if his suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.
