Bootstrap timeout on instance creation
07-21-2023 11:42 AM - edited 07-21-2023 11:50 AM
I am getting the following error...
{
  "reason": {
    "code": "BOOTSTRAP_TIMEOUT",
    "parameters": {
      "databricks_error_message": "[id: InstanceId(i-0e552e85c37c9da2d), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-1266184353274328-a3cf37c9-8da9-4828-8cff-aa094025aad5), lastStatusChangeTime: 1689896414325, groupIdOpt Some(0),requestIdOpt Some(0720-205452-ixn0b8ac-78df74c0-44b8-4eed-9),version 2] with threshold 700 seconds timed out after 708057 milliseconds. Please check network connectivity from the data plane to the control plane.",
      "instance_id": "i-0e552e85c37c9da2d"
    }
  },
  "add_node_failure_details": {
    "failure_count": 1,
    "resource_type": "container",
    "will_retry": false
  }
}
This is from the system log for that instance.
[Bootstrap Event] DNS output for databricks-prod-artifacts-us-east-1.s3.amazonaws.com:
Server: 10.10.0.2
Address: 10.10.0.2#53
Non-authoritative answer:
databricks-prod-artifacts-us-east-1.s3.amazonaws.com canonical name = s3-1-w.amazonaws.com.
s3-1-w.amazonaws.com canonical name = s3-w.us-east-1.amazonaws.com.
Name: s3-w.us-east-1.amazonaws.com Address: 52.217.164.81
Name: s3-w.us-east-1.amazonaws.com Address: 52.217.197.9
Name: s3-w.us-east-1.amazonaws.com Address: 52.217.236.193
Name: s3-w.us-east-1.amazonaws.com Address: 54.231.160.105
Name: s3-w.us-east-1.amazonaws.com Address: 54.231.166.201
Name: s3-w.us-east-1.amazonaws.com Address: 3.5.8.156
Name: s3-w.us-east-1.amazonaws.com Address: 52.216.112.83
Name: s3-w.us-east-1.amazonaws.com Address: 52.217.112.113
[Bootstrap Event] Can reach databricks-prod-artifacts-us-east-1.s3.amazonaws.com: [FAILED]
[Bootstrap Event] DNS output for databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com:
Server: 10.10.0.2
Address: 10.10.0.2#53
Non-authoritative answer:
databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com canonical name = s3-r-w.us-west-2.amazonaws.com.
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.192.26
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.194.242
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.196.74
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.209.98
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.218.192.177
Name: s3-r-w.us-west-2.amazonaws.com Address: 3.5.82.203
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.128.130
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.138.58
[ 252.523636] audit: kauditd hold queue overflow
[ 252.598843] audit: kauditd hold queue overflow
[ 252.639047] audit: kauditd hold queue overflow
[Bootstrap Event] Can reach databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com: [FAILED]
[Bootstrap Event] DNS output for databricks-update-oregon.s3.us-west-2.amazonaws.com:
Server: 10.10.0.2
Address: 10.10.0.2#53
Non-authoritative answer:
databricks-update-oregon.s3.us-west-2.amazonaws.com canonical name = s3-r-w.us-west-2.amazonaws.com.
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.218.133.106
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.218.153.81
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.218.196.49
Name: s3-r-w.us-west-2.amazonaws.com Address: 3.5.80.138
Name: s3-r-w.us-west-2.amazonaws.com Address: 3.5.84.111
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.195.66
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.210.114
Name: s3-r-w.us-west-2.amazonaws.com Address: 52.92.240.218
[Bootstrap Event] Can reach databricks-update-oregon.s3.us-west-2.amazonaws.com: [FAILED]
I was not able to ssh into the instance that was being created, so I started a new instance in the same security group.
ubuntu@ip-10-10-81-10:~$ nc -vz databricks-update-oregon.s3.us-west-2.amazonaws.com 443
Connection to databricks-update-oregon.s3.us-west-2.amazonaws.com (52.218.179.122) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -vz databricks-prod-artifacts-us-east-1.s3.amazonaws.com 443
Connection to databricks-prod-artifacts-us-east-1.s3.amazonaws.com (54.231.128.137) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv ireland.cloud.databricks.com 443
Connection to ireland.cloud.databricks.com (3.250.244.127) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv tunnel.eu-west-1.cloud.databricks.com 443
Connection to tunnel.eu-west-1.cloud.databricks.com (3.250.244.114) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv s3.amazonaws.com 443
Connection to s3.amazonaws.com (52.217.142.136) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv s3.eu-west-1.amazonaws.com 443
Connection to s3.eu-west-1.amazonaws.com (52.92.17.176) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv sts.amazonaws.com 443
Connection to sts.amazonaws.com (54.239.29.25) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv sts.eu-west-1.amazonaws.com 443
Connection to sts.eu-west-1.amazonaws.com (54.239.32.126) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv kinesis.eu-west-1.amazonaws.com 443
Connection to kinesis.eu-west-1.amazonaws.com (99.80.34.206) 443 port [tcp/https] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -zv md15cf9e1wmjgny.cxg30ia2wqgj.eu-west-1.rds.amazonaws.com 3306
Connection to md15cf9e1wmjgny.cxg30ia2wqgj.eu-west-1.rds.amazonaws.com (54.73.70.178) 3306 port [tcp/mysql] succeeded!
ubuntu@ip-10-10-81-10:~$ nc -uzv 3.250.244.112 443
Connection to 3.250.244.112 443 port [udp/https] succeeded!
I also tried allowing all inbound and outbound TCP and UDP traffic on the security group, and instance creation still failed. I'm looking for guidance on how to dig deeper into this issue.
Thanks.
11-20-2023 10:08 AM
Hello,
Thanks for contacting Databricks Support.
The log line "[Bootstrap Event] Can reach databricks-prod-artifacts-us-east-1.s3.amazonaws.com: [FAILED]" suggests that the instance could not reach a Databricks-related AWS S3 bucket from your environment. The DNS output for databricks-update-oregon.s3.us-west-2.amazonaws.com shows the DNS server at 10.10.0.2 being queried and returning addresses, so name resolution itself appears to work. This type of failure usually comes from network configuration or connectivity problems.
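If the system log is long, it may help to pull out just the reachability results. Here is a minimal sketch in Python that scans the log text for "Can reach" lines like the ones posted above. The function names are made up for illustration, and only the "[FAILED]" status is confirmed by this thread; any other status string is an assumption.

```python
import re

# Matches lines such as:
#   [Bootstrap Event] Can reach <host>: [FAILED]
# Only "[FAILED]" appears in the log in this thread; other status
# values (for example "[OK]") are an assumption.
REACH_RE = re.compile(r"\[Bootstrap Event\] Can reach (\S+): \[(\w+)\]")


def reachability_results(log_text):
    """Return {hostname: status} for every 'Can reach' entry in the log."""
    return {host: status for host, status in REACH_RE.findall(log_text)}


def failed_endpoints(log_text):
    """Return the hostnames reported as FAILED, in the order they appear."""
    return [host for host, status in REACH_RE.findall(log_text)
            if status == "FAILED"]
```

Passing the pasted system log to `failed_endpoints` should return the three S3 hostnames that the bootstrap script could not reach, which gives a short list of endpoints to test by hand.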
You were able to establish a successful connection to databricks-update-oregon.s3.us-west-2.amazonaws.com on port 443 (HTTPS) from the new instance you created in the same security group. Considering this, we need to address the initial issue of not being able to SSH into the first instance:
- Since the new instance is accessible, compare its configuration with the first instance to identify any discrepancies.
- Verify that the route table associated with the subnet allows outbound traffic and has the appropriate routes for inbound SSH traffic.
- Check the network ACLs for rules that might be blocking inbound SSH traffic.
- Check whether DNS hostnames are enabled for the VPC.
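One way to compare the two instances is to run the same reachability checks from each and diff the results. Below is a minimal sketch in Python that mirrors the `nc -vz` TCP checks from the original post. The function names and the 5-second timeout are assumptions for illustration, and the endpoint list is just a subset of the hosts tested earlier in this thread.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Endpoints taken from the nc checks in the original post.
ENDPOINTS = [
    ("databricks-update-oregon.s3.us-west-2.amazonaws.com", 443),
    ("databricks-prod-artifacts-us-east-1.s3.amazonaws.com", 443),
    ("tunnel.eu-west-1.cloud.databricks.com", 443),
]


def check_all(endpoints):
    """Return {"host:port": bool} for each (host, port) pair."""
    return {f"{host}:{port}": can_connect(host, port)
            for host, port in endpoints}
```

Calling `check_all(ENDPOINTS)` on each instance gives a dictionary that can be compared side by side. Like `nc -vz`, this only shows that a TCP handshake completes; it does not prove that an HTTPS request to the bucket would succeed.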
Regards,

