02-01-2022 11:01 AM
Error: Please check network connectivity from the data plane to the control plane.
{
  "reason": {
    "code": "BOOTSTRAP_TIMEOUT",
    "parameters": {
      "databricks_error_message": "[id: InstanceId(i-0457092c), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-1118642488485491-ec0d76eb-7d7f-4589), lastStatusChangeTime: 1643738776621, groupIdOpt None,requestIdOpt Some(0201-162313-sfd3cke4),version 0] with threshold 700 seconds timed out after 701101 milliseconds. Please check network connectivity from the data plane to the control plane.",
      "instance_id": "i-0457092d46c635b7a"
    }
  }
}
Labels: Cluster, Clusters, Control Plane, New Workspace
Accepted Solutions
02-09-2022 02:46 AM
Can you please collect the system logs from the AWS EC2 console as soon as the cluster fails? System logs for the failed instance are accessible from the AWS console for up to an hour after shutdown.
After that, the AWS console clears its references to the terminated instances.
Please follow the steps below to collect the system logs:
1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
2. In the left navigation pane, choose Instances, and select the instance by its instance ID. The instance ID, which starts with i-xxxxxx, is printed in the “Event Log” section of the cluster details page in the Databricks workspace. Note that the instance must have been terminated within the last hour; otherwise it will not show up in the list. If the cluster creation failure happened a long time ago, please restart the cluster to reproduce the error first.
3. Choose Actions > Monitor and troubleshoot > Get System Log.
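If you prefer the command line, the same console output can usually be fetched with the AWS CLI. The following is only a sketch: `get_system_log` is a hypothetical helper name, it assumes the AWS CLI is installed with credentials that allow `ec2:GetConsoleOutput`, and the same retention window after termination applies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the console (system) log of one EC2 instance.
# Usage: get_system_log <instance-id> <region>
get_system_log() {
  local instance_id="$1" region="$2"
  if [ -z "$instance_id" ] || [ -z "$region" ]; then
    echo "usage: get_system_log <instance-id> <region>" >&2
    return 2
  fi
  # --query Output keeps only the log text and drops the metadata fields.
  aws ec2 get-console-output \
    --instance-id "$instance_id" \
    --region "$region" \
    --query Output \
    --output text
}
```

For example, `get_system_log i-0457092d46c635b7a eu-west-1 > system.log` would save the log for the instance ID shown in the error above, assuming the workspace is in eu-west-1.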
Here I have used eu-west-1 as a sample region; change the hostnames to match your region.
# verify access to the webapp
nc -zv ireland.cloud.databricks.com 443
# verify access to the tunnel
nc -zv tunnel.eu-west-1.cloud.databricks.com 443
# verify S3 global and regional access
nc -zv s3.amazonaws.com 443
nc -zv s3.eu-west-1.amazonaws.com 443
# verify STS global and regional access
nc -zv sts.amazonaws.com 443
nc -zv sts.eu-west-1.amazonaws.com 443
# verify regional kinesis access
nc -zv kinesis.eu-west-1.amazonaws.com 443
# verify metastore access
nc -zv md15cf9e1wmjgny.cxg30ia2wqgj.eu-west-1.rds.amazonaws.com 3306
# control plane infra CIDR range check (verify with docs page for ip range)
nc -uzv 3.250.244.112 443
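The regional checks above can also be run in one pass. This is a rough sketch, not an official script: `region_endpoints` and `check_endpoints` are hypothetical helper names, and the hostname patterns are simply the eu-west-1 names above with the region substituted, which is an assumption that may not hold for every region. The webapp and metastore hostnames are not derived from the region, so pass those in yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "host port" pairs for one AWS region.
# ASSUMPTION: the patterns mirror the eu-west-1 hostnames used above.
region_endpoints() {
  local region="$1"
  printf '%s %s\n' \
    "tunnel.${region}.cloud.databricks.com" 443 \
    "s3.amazonaws.com" 443 \
    "s3.${region}.amazonaws.com" 443 \
    "sts.amazonaws.com" 443 \
    "sts.${region}.amazonaws.com" 443 \
    "kinesis.${region}.amazonaws.com" 443
}

# Read "host port" lines from stdin, run a TCP check on each one,
# and print OK or FAIL per endpoint. Returns non-zero if any check failed.
check_endpoints() {
  local host port failed=0
  while read -r host port; do
    # -z: connect without sending data; -w 5: give up after 5 seconds.
    if nc -z -w 5 "$host" "$port" </dev/null >/dev/null 2>&1; then
      echo "OK   $host:$port"
    else
      echo "FAIL $host:$port"
      failed=1
    fi
  done
  return "$failed"
}
```

Run it from a machine inside the data plane subnet, for example `region_endpoints eu-west-1 | check_endpoints`. Extra endpoints such as the webapp or metastore can be appended with `echo "ireland.cloud.databricks.com 443" | check_endpoints`.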
Please also go through the document below:
https://docs.databricks.com/administration-guide/account-api/aws-storage.html
02-01-2022 01:48 PM
Since you are seeing the "BOOTSTRAP_TIMEOUT" error in a new workspace, you need to make changes to your AWS network configuration. If your workspace uses a customer-managed VPC, check that the routes are valid and that the NAT gateway and internet gateway (IGW) are configured correctly. To troubleshoot further, you can deploy an EC2 instance in the Databricks data plane subnet and try to reach the internet from it.
https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed-vpc.html
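As a quick way to do the internet test from that EC2 instance, something like the following could be used. It is a minimal sketch: `egress_ip` is a hypothetical helper name, and it assumes `curl` is installed on the test instance. It calls AWS's public `checkip.amazonaws.com` endpoint, which returns the public IP the request came from; with a NAT gateway in place, that should be the gateway's Elastic IP.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm outbound internet access and show the egress IP.
egress_ip() {
  local ip
  # --max-time 10 makes a missing route fail after 10 seconds instead of hanging.
  if ip="$(curl -sS --max-time 10 https://checkip.amazonaws.com)"; then
    echo "Outbound internet OK, egress IP: $ip"
  else
    echo "No outbound internet access from this subnet" >&2
    return 1
  fi
}
```

If `egress_ip` fails or times out, the route table, NAT gateway, or IGW for the subnet is the first place to look.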