Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Deployment of a private Databricks workspace

lubiarzm1
New Contributor II

I tried to configure Databricks with VNet injection and ran into a few problems during deployment.
1. I tried to deploy my workspace using IaC with Terraform. I keep hitting an issue with the NSG, even when I configure it as described in these docs: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject#nsg . In the GUI I could use this option and it works.

[Screenshot: lubiarzm1_0-1763718040279.png]

But the Terraform provider does not seem to expose this option (I tried the latest azurerm). How can I work around that?

2. After deployment, my compute cluster fails to start.

Error message: [details] X_NHC_CONTROL_PLANE_SSL_ERROR: Instance failed network health check before bootstrapping with fatal error: X_NHC_CONTROL_PLANE_SSL_ERROR 2 failed component(s): control_plane internet Retryable: false Based on the failure results: List(entity: "adb-xxxxxxxxxxx.0.azuredatabricks.net" outcome: "ssl_error" 
duration_sec: 241.42003 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to adb-xxxxxxxxxxx.0.azuredatabricks.net:443" last_error_code: 35 , entity: "www.databricks.com" outcome: "ssl_error" duration_sec: 223.5729 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.databricks.com:443" last_error_code: 35 )(OnDemand)  
Azure error code: X_NHC_CONTROL_PLANE_SSL_ERROR  Azure error message: Instance failed network health check before bootstrapping with fatal error: X_NHC_CONTROL_PLANE_SSL_ERROR 2 failed component(s): control_plane internet Retryable: false Based on the failure results: 
List(entity: "adb-xxxxxxxxxxx.0.azuredatabricks.net" outcome: "ssl_error" duration_sec: 241.42003 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to adb-xxxxxxxxxxx.0.azuredatabricks.net:443" last_error_code: 35 , entity: "www.databricks.com" outcome: "ssl_error" duration_sec: 223.5729 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.databricks.com:443" last_error_code: 35 )(OnDemand)

It looks like I have a connectivity problem between the control plane and the workers. How should I handle that?

Maybe you have some examples of NSGs?
My Terraform manifest looks like this:

resource "azurerm_databricks_access_connector" "connector" {
  name                = "dac-${var.name_of_workspace}"
  resource_group_name = var.rg
  location            = var.location

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_databricks_workspace" "workspace" {
  provider                              = azurerm
  name                                  = "dw-${var.name_of_workspace}"
  resource_group_name                   = var.rg
  location                              = var.location
  sku                                   = var.tier_of_databricks
  
  managed_resource_group_name           = "${var.name_of_workspace}-managed"
  public_network_access_enabled         = false
  default_storage_firewall_enabled      = false
  access_connector_id                   = azurerm_databricks_access_connector.connector.id

  custom_parameters {
    virtual_network_id      = var.vnet_id
    public_subnet_name      = var.subnet_name_public
    private_subnet_name     = var.subnet_name_private

    public_subnet_network_security_group_association_id  = var.public_nsg_id
    private_subnet_network_security_group_association_id = var.private_nsg_id
  }

  tags = merge(local.default_tags,
    { module_version = var.module_version }
  )

  depends_on = [
    azurerm_databricks_access_connector.connector
  ]
}

 

terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "~> 1.97.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.53.0"
    }
  }
}

 

Of course, to handle the connection I used private endpoints for auth and ui-api.
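For point 1, one candidate for the missing setting is the `network_security_group_rules_required` argument of `azurerm_databricks_workspace`, which the azurerm provider documents as required when `public_network_access_enabled = false`. Whether it corresponds to the portal option in the screenshot is an assumption. The sketch below is a variant of the workspace resource above, not a verified fix. `var.public_subnet_id` and `var.private_subnet_id` are hypothetical variables that are not in the original manifest. Note also that the two `*_association_id` arguments expect the ID of an `azurerm_subnet_network_security_group_association` resource rather than the ID of the NSG itself.

```hcl
# Sketch only. var.public_subnet_id / var.private_subnet_id are hypothetical.

# Associate each delegated subnet with its NSG; the workspace references
# the IDs of these associations, not the NSG IDs.
resource "azurerm_subnet_network_security_group_association" "public" {
  subnet_id                 = var.public_subnet_id
  network_security_group_id = var.public_nsg_id
}

resource "azurerm_subnet_network_security_group_association" "private" {
  subnet_id                 = var.private_subnet_id
  network_security_group_id = var.private_nsg_id
}

# Replaces the workspace resource shown above (tags and depends_on omitted).
resource "azurerm_databricks_workspace" "workspace" {
  name                        = "dw-${var.name_of_workspace}"
  resource_group_name         = var.rg
  location                    = var.location
  sku                         = var.tier_of_databricks # Private Link needs the premium tier
  managed_resource_group_name = "${var.name_of_workspace}-managed"
  access_connector_id         = azurerm_databricks_access_connector.connector.id

  public_network_access_enabled = false

  # Assumed to be the setting behind the portal option: with back-end
  # Private Link, no Azure Databricks rules are added to the NSGs.
  network_security_group_rules_required = "NoAzureDatabricksRules"

  custom_parameters {
    no_public_ip        = true # secure cluster connectivity
    virtual_network_id  = var.vnet_id
    public_subnet_name  = var.subnet_name_public
    private_subnet_name = var.subnet_name_private

    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
  }
}
```

Referencing the association resources directly also gives Terraform the dependency ordering it needs, so the associations exist before the workspace is created.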

4 REPLIES

lubiarzm1
New Contributor II

Oh, and I forgot: I cannot use a NAT gateway for outbound traffic.

lubiarzm1
New Contributor II

I investigated with my network team: the workers are not sending traffic to my workspace through the private endpoint but through the public address. How can I work around that?

Coffee77
Contributor III

It seems to be an issue with your VNet route table configuration. Finding the exact cause is hard without being able to look into your setup. Take a look here for details on how to configure it: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/udr
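As a rough illustration of the route table approach in the linked doc, a user-defined route can target an Azure service tag instead of a CIDR range. This is only a sketch under assumptions: the resource names and `var.public_subnet_id` / `var.private_subnet_id` are hypothetical, and the linked page lists further service tags and destinations that a real setup may need. If all control plane traffic is meant to stay on a private endpoint, a route to `Internet` like this may not be what you want.

```hcl
# Sketch only. Names and the subnet ID variables are hypothetical.
resource "azurerm_route_table" "databricks" {
  name                = "rt-databricks"
  location            = var.location
  resource_group_name = var.rg

  # Send control plane traffic straight out instead of through a
  # firewall or appliance that may reset the TLS connection.
  route {
    name           = "to-azure-databricks"
    address_prefix = "AzureDatabricks" # Azure service tag
    next_hop_type  = "Internet"
  }
}

# Attach the route table to both workspace subnets.
resource "azurerm_subnet_route_table_association" "public" {
  subnet_id      = var.public_subnet_id
  route_table_id = azurerm_route_table.databricks.id
}

resource "azurerm_subnet_route_table_association" "private" {
  subnet_id      = var.private_subnet_id
  route_table_id = azurerm_route_table.databricks.id
}
```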


Lifelong Learner Cloud & Data Solution Architect | https://www.youtube.com/@CafeConData

lubiarzm1
New Contributor II

After opening traffic to the public address of the workspace, my error changed to:

Error message: [details] X_NHC_CONTROL_PLANE_HTTP_ERROR: Instance failed network health check before bootstrapping with fatal error: X_NHC_CONTROL_PLANE_HTTP_ERROR 2 failed component(s): control_plane internet Retryable: false Based on the failure results: List(entity: "adb-xxxxxxxxxx.azuredatabricks.net" outcome: "http_error" 
duration_sec: 282.8475 message: "Configured privacy settings disallow access for this workspace over your current network. Please contact your administrator for " last_error_code: 401 , entity: "www.databricks.com" outcome: "ssl_error" duration_sec: 226.39476 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.databricks.com:443" last_error_code: 35 )(OnDemand)

The workers are not resolving my workspace address to the private endpoint. Is it possible to change that?
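One thing to check is whether the VNet that hosts the workers can resolve the `privatelink.azuredatabricks.net` private DNS zone. The sketch below links that zone to the worker VNet and registers a `databricks_ui_api` private endpoint in it. It is an untested outline, not a confirmed fix for this workspace: the resource names and `var.private_endpoint_subnet_id` are hypothetical, and it assumes the endpoint subnet is reachable from the workspace subnets. If the VNet uses custom DNS servers, they would also need to forward these queries to Azure DNS (168.63.129.16).

```hcl
# Sketch only. Names and var.private_endpoint_subnet_id are hypothetical.
resource "azurerm_private_dns_zone" "databricks" {
  name                = "privatelink.azuredatabricks.net"
  resource_group_name = var.rg
}

# Without this link, workers in the injected VNet resolve the
# workspace hostname to its public address.
resource "azurerm_private_dns_zone_virtual_network_link" "workers" {
  name                  = "dbx-workers-link"
  resource_group_name   = var.rg
  private_dns_zone_name = azurerm_private_dns_zone.databricks.name
  virtual_network_id    = var.vnet_id
}

# Back-end private endpoint that the workers use to reach the control plane.
resource "azurerm_private_endpoint" "backend" {
  name                = "pe-dbx-backend"
  location            = var.location
  resource_group_name = var.rg
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "psc-dbx-backend"
    private_connection_resource_id = azurerm_databricks_workspace.workspace.id
    is_manual_connection           = false
    subresource_names              = ["databricks_ui_api"]
  }

  # Creates the A record for the workspace in the private zone.
  private_dns_zone_group {
    name                 = "default"
    private_dns_zone_ids = [azurerm_private_dns_zone.databricks.id]
  }
}
```

After applying something like this, running `nslookup` on the workspace hostname from a VM in the same VNet should return a private IP if resolution is working.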