Databricks cluster start-up: Self Bootstrap Failure

mikaellognseth
New Contributor III

When attempting to deploy/start an Azure Databricks cluster through the UI, the following error consistently occurs:

{
  "reason": {
    "code": "SELF_BOOTSTRAP_FAILURE",
    "parameters": {
      "databricks_error_message": "Self-bootstrap failure during launch. Please try again later and contact Databricks if the problem persists. Node daemon fast failed and did not answer ping for instance a3488320d27a401fa679089b87410fab"
    }
  }
}
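
For anyone triaging several of these failures, here is a minimal Python sketch that pulls the termination code and instance ID out of a payload shaped like the one above. The function name `summarize_termination` is made up for illustration, and the regular expression assumes the instance ID is always 32 lowercase hex characters as in this message; it is not based on any documented format.

```python
import json
import re


def summarize_termination(payload: str) -> dict:
    """Extract the code, message, and instance ID from a termination payload."""
    reason = json.loads(payload)["reason"]
    message = reason.get("parameters", {}).get("databricks_error_message", "")
    # Assumption: the instance ID appears as "instance <32 hex chars>".
    match = re.search(r"instance ([0-9a-f]{32})", message)
    return {
        "code": reason.get("code"),
        "message": message,
        "instance_id": match.group(1) if match else None,
    }


example = """
{
  "reason": {
    "code": "SELF_BOOTSTRAP_FAILURE",
    "parameters": {
      "databricks_error_message": "Self-bootstrap failure during launch. Please try again later and contact Databricks if the problem persists. Node daemon fast failed and did not answer ping for instance a3488320d27a401fa679089b87410fab"
    }
  }
}
"""

summary = summarize_termination(example)
print(summary["code"], summary["instance_id"])
```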

When looking at the resource group, I can see that the resources are being deployed, but as the self-bootstrap step fails, they are removed again.

We're using VNET injection with two /26 subnets for the Databricks workspace. Both subnets have the "Microsoft.Databricks/workspaces" service delegation and the following service delegation actions:

"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
"Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"
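
To rule out address exhaustion as a cause, the /26 subnet size mentioned above can be sanity-checked with a short Python sketch. It assumes the usual Azure behaviour of reserving 5 addresses per subnet and one address per cluster node in each of the two subnets; the CIDR `10.0.0.0/26` is a placeholder, not the range used in this deployment.

```python
import ipaddress

# Azure reserves 5 addresses in every subnet.
AZURE_RESERVED_IPS = 5


def max_cluster_nodes(cidr: str) -> int:
    """Upper bound on nodes for a subnet, given one IP per node per subnet."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - AZURE_RESERVED_IPS, 0)


# Placeholder CIDR; substitute the real host/container subnet ranges.
print(max_cluster_nodes("10.0.0.0/26"))  # 59
```

With two /26 subnets this leaves room for dozens of nodes, so a single small cluster failing to start is unlikely to be a sizing problem.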

All of the standard NSG rules are added and seem correct when comparing to the official Microsoft documentation.

We deploy with Terraform; I have added the relevant bits of the DBW deployment here.

Accepted Solutions

mikaellognseth
New Contributor III

Hi,

In our case the issue turned out to be DNS: the DNS servers set on the Databricks workspace VNet are only reachable once the "management" VNet is peered in our setup. It took a while to figure out, as the error didn't give much clarity. 🙂
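
For others hitting this, a rough Python sketch of the kind of check that would have surfaced the problem sooner: run it from a VM in the workspace VNet (so it uses the same DNS servers as the cluster nodes) and see whether the hostnames the cluster needs resolve at all. The function name `check_dns` and the injectable `resolver` argument are illustrative choices, and which hostnames to test depends on your own setup.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional


def check_dns(
    hostnames: Iterable[str],
    resolver: Callable = socket.getaddrinfo,
) -> Dict[str, Optional[List[str]]]:
    """Map each hostname to its resolved addresses, or None if lookup fails."""
    results: Dict[str, Optional[List[str]]] = {}
    for host in hostnames:
        try:
            infos = resolver(host, 443)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address.
            results[host] = sorted({info[4][0] for info in infos})
        except OSError:
            # socket.gaierror (a subclass of OSError) is raised when the
            # configured DNS servers are unreachable or the name is unknown.
            results[host] = None
    return results
```

Calling `check_dns(["<your-workspace-url>"])` with your own hostnames returns `None` for every name that cannot be resolved. If everything comes back `None`, the DNS servers configured on the VNet are probably not reachable from it, which was the situation here until the peering was in place.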

Replies

Kaniz
Community Manager

Hi @Mikael Lognseth, this issue can occur if you have custom SSL settings that prevent the data plane from connecting to "tunnel.<region>.cloud.databricks.com".

Solution: turn off SSL inspection on your firewall, or exempt traffic to "cloud.databricks.com" from it.
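
One hedged way to test for SSL inspection is to look at who issued the certificate presented for the tunnel host. The Python sketch below uses only the standard `ssl` and `socket` modules; the helper names are made up for illustration, and the host to probe (including the region) has to be filled in by you.

```python
import socket
import ssl
from typing import Optional


def issuer_organization(cert: dict) -> Optional[str]:
    """Return organizationName from the issuer of a getpeercert() dict."""
    # The issuer is a tuple of RDNs, each a tuple of (key, value) pairs.
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Open a verified TLS connection and return the server certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Running `issuer_organization(fetch_peer_cert(host))` from inside the VNet and seeing your firewall vendor or an internal CA, rather than a public CA, suggests the traffic is being intercepted. An `ssl.SSLCertVerificationError` during the handshake points the same way.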

Kaniz
Community Manager

Hi @Mikael Lognseth, we haven't heard from you since my last response, and I wanted to check whether my suggestions helped. If you have found a solution, please share it with the community, as it may help others.

AJK1
New Contributor II

Hello.

I'm getting the same error. I'm also fairly certain that we set up the workspace correctly within our VNet, per Microsoft's instructions. May I ask how, specifically, you went about solving the DNS issue?

Dooley
Valued Contributor

@Andrew Kwiatkowski Did you also set up your workspace with Terraform? Have you never been able to run a cluster in the new workspace, or did it work before and has now stopped working?

AJK1
New Contributor II
Hi Sara. Apologies – I should have specified… we do not use Terraform. I was never able to actually start a cluster successfully, just define it.

Dooley
Valued Contributor

Have you resolved your DNS for the newly created Azure workspace? Here are the Azure instructions on how to troubleshoot your DNS.

Dooley
Valued Contributor

Also are you using Databricks PrivateLink?

AJK1
New Contributor II
Yes, we’re attempting to follow the steps outlined for PrivateLink.