Databricks cluster start-up: Self Bootstrap Failure

mikaellognseth

When attempting to deploy/start an Azure Databricks cluster through the UI, the following error consistently occurs:

{
  "reason": {
    "code": "SELF_BOOTSTRAP_FAILURE",
    "parameters": {
      "databricks_error_message": "Self-bootstrap failure during launch. Please try again later and contact Databricks if the problem persists. Node daemon fast failed and did not answer ping for instance a3488320d27a401fa679089b87410fab"
    }
  }
}

When looking at the resource group, I can see that the resources are being deployed, but as the self-bootstrap step fails, they are removed again.

We're using VNet injection with two /26 subnets for the Databricks workspace. Both subnets are delegated to "Microsoft.Databricks/workspaces" with the following service delegation actions:

"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
"Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"
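For anyone following along, a delegation of this shape could be written with the azurerm provider roughly as below. This is only an illustrative sketch, not the actual deployment code: the resource label, subnet name, variable names, and address prefix are hypothetical placeholders, and the second subnet would be declared the same way.

```hcl
# Hypothetical inputs; the real names and ranges are not part of this post.
variable "resource_group_name" {
  type = string
}

variable "vnet_name" {
  type = string
}

# One of the two /26 subnets used for VNet injection (placeholder prefix).
resource "azurerm_subnet" "databricks_public" {
  name                 = "snet-databricks-public" # hypothetical name
  resource_group_name  = var.resource_group_name
  virtual_network_name = var.vnet_name
  address_prefixes     = ["10.0.0.0/26"] # hypothetical range

  # Delegate the subnet to Azure Databricks with the three actions listed above.
  delegation {
    name = "databricks-delegation" # hypothetical name

    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}
```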

All of the standard NSG rules have been added and appear correct when compared with the official Microsoft documentation.

We are using Terraform to deploy, and I have added the relevant parts of the Databricks workspace (DBW) deployment here.