4 hours ago
I tried to set up Databricks with VNet injection and ran into a few problems during deployment.
1. I am deploying my workspace as IaC with Terraform. I keep running into an issue with the NSG, even when I configure it as described in these docs: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject#nsg . In the GUI there is an option for this and it works.
But the Terraform provider does not seem to expose this option (I tried the latest azurerm). How can I work around that?
2. After deployment, my compute cluster fails to start.
Error message: [details] X_NHC_CONTROL_PLANE_SSL_ERROR: Instance failed network health check before bootstrapping with fatal error: X_NHC_CONTROL_PLANE_SSL_ERROR 2 failed component(s): control_plane internet Retryable: false Based on the failure results: List(entity: "adb-xxxxxxxxxxx.0.azuredatabricks.net" outcome: "ssl_error"
duration_sec: 241.42003 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to adb-xxxxxxxxxxx.0.azuredatabricks.net:443" last_error_code: 35 , entity: "www.databricks.com" outcome: "ssl_error" duration_sec: 223.5729 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.databricks.com:443" last_error_code: 35 )(OnDemand)
Azure error code: X_NHC_CONTROL_PLANE_SSL_ERROR Azure error message: Instance failed network health check before bootstrapping with fatal error: X_NHC_CONTROL_PLANE_SSL_ERROR 2 failed component(s): control_plane internet Retryable: false Based on the failure results:
List(entity: "adb-xxxxxxxxxxx.0.azuredatabricks.net" outcome: "ssl_error" duration_sec: 241.42003 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to adb-xxxxxxxxxxx.0.azuredatabricks.net:443" last_error_code: 35 , entity: "www.databricks.com" outcome: "ssl_error" duration_sec: 223.5729 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.databricks.com:443" last_error_code: 35 )(OnDemand)
It looks like I have a problem with the connection between the control plane and the workers. How can I handle that?
Do you have any example NSG configurations?
My Terraform manifest looks like this:
resource "azurerm_databricks_access_connector" "connector" {
  name                = "dac-${var.name_of_workspace}"
  resource_group_name = var.rg
  location            = var.location
  identity {
    type = "SystemAssigned"
  }
}
resource "azurerm_databricks_workspace" "workspace" {
  provider                         = azurerm
  name                             = "dw-${var.name_of_workspace}"
  resource_group_name              = var.rg
  location                         = var.location
  sku                              = var.tier_of_databricks
  managed_resource_group_name      = "${var.name_of_workspace}-managed"
  public_network_access_enabled    = false
  default_storage_firewall_enabled = false
  access_connector_id              = azurerm_databricks_access_connector.connector.id
  custom_parameters {
    virtual_network_id                                   = var.vnet_id
    public_subnet_name                                   = var.subnet_name_public
    private_subnet_name                                  = var.subnet_name_private
    public_subnet_network_security_group_association_id  = var.public_nsg_id
    private_subnet_network_security_group_association_id = var.private_nsg_id
  }
  tags = merge(local.default_tags,
    { module_version = var.module_version }
  )
  depends_on = [
    azurerm_databricks_access_connector.connector
  ]
}
terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "~> 1.97.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.53.0"
    }
  }
}
Of course, to handle connectivity I use private endpoints for auth and ui-api.
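For point 1, the GUI option may correspond to the `network_security_group_rules_required` argument of `azurerm_databricks_workspace`. That mapping is an assumption and worth checking against the provider docs for your version. A minimal sketch that reuses the variables from the manifest above (declared elsewhere) and omits the tags and access connector for brevity:

```hcl
resource "azurerm_databricks_workspace" "workspace" {
  name                        = "dw-${var.name_of_workspace}"
  resource_group_name         = var.rg
  location                    = var.location
  sku                         = var.tier_of_databricks
  managed_resource_group_name = "${var.name_of_workspace}-managed"

  public_network_access_enabled = false

  # Assumption: this is the Terraform counterpart of the GUI option.
  # Other accepted values are "AllRules" and "NoAzureServiceRules".
  network_security_group_rules_required = "NoAzureDatabricksRules"

  custom_parameters {
    virtual_network_id                                   = var.vnet_id
    public_subnet_name                                   = var.subnet_name_public
    private_subnet_name                                  = var.subnet_name_private
    public_subnet_network_security_group_association_id  = var.public_nsg_id
    private_subnet_network_security_group_association_id = var.private_nsg_id
  }
}
```

With public network access disabled, "NoAzureDatabricksRules" is the value usually paired with back-end Private Link, so the NSG rules for the control plane are not required.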
3 hours ago
Oh, and I forgot: I cannot use a NAT gateway for outbound traffic.
3 hours ago
I investigated with my network team: the workers are not sending traffic to my workspace through the private endpoint but through its public address. How can I work around this?
2 hours ago
It seems to be an issue with your VNet route table configuration. Pinpointing the exact cause is hard without being able to look at the setup. See here for details on how to configure it: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/udr
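As a rough illustration of the kind of route table that page describes, here is a minimal Terraform sketch. The resource names and `var.public_subnet_id` are hypothetical, `var.rg` and `var.location` are taken from the manifest earlier in the thread, and whether a service-tag route is appropriate for a setup with private endpoints and no NAT gateway is not something this thread confirms:

```hcl
# Hypothetical route table for the Databricks subnets.
resource "azurerm_route_table" "databricks" {
  name                = "rt-databricks"
  resource_group_name = var.rg
  location            = var.location

  # Route control plane traffic by service tag instead of a fixed CIDR.
  route {
    name           = "to-databricks-control-plane"
    address_prefix = "AzureDatabricks"
    next_hop_type  = "Internet"
  }
}

# Attach the route table to one of the workspace subnets.
# var.public_subnet_id is an assumed variable, not part of the original manifest.
resource "azurerm_subnet_route_table_association" "public" {
  subnet_id      = var.public_subnet_id
  route_table_id = azurerm_route_table.databricks.id
}
```

The same association would be repeated for the private subnet. Any default route (0.0.0.0/0) pointing at a firewall would sit alongside these more specific routes.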
51m ago
After opening traffic to the public address of the workspace, my error changed to:
Error message: [details] X_NHC_CONTROL_PLANE_HTTP_ERROR: Instance failed network health check before bootstrapping with fatal error: X_NHC_CONTROL_PLANE_HTTP_ERROR 2 failed component(s): control_plane internet Retryable: false Based on the failure results: List(entity: "adb-xxxxxxxxxx.azuredatabricks.net" outcome: "http_error"
duration_sec: 282.8475 message: "Configured privacy settings disallow access for this workspace over your current network. Please contact your administrator for " last_error_code: 401 , entity: "www.databricks.com" outcome: "ssl_error" duration_sec: 226.39476 message: "curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.databricks.com:443" last_error_code: 35 )(OnDemand)
The workers are not resolving my workspace address to the private endpoint. Is it possible to change that?
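One common cause of workers resolving the public address is that the `privatelink.azuredatabricks.net` private DNS zone is not linked to the VNet the workers run in. A minimal sketch of that wiring follows, assuming Azure-provided DNS is used in the VNet; the resource names and `var.private_endpoint_subnet_id` are hypothetical, and the other variables come from the manifest earlier in the thread:

```hcl
# Private DNS zone used by Azure Databricks Private Link.
resource "azurerm_private_dns_zone" "databricks" {
  name                = "privatelink.azuredatabricks.net"
  resource_group_name = var.rg
}

# Link the zone to the VNet that hosts the worker subnets,
# so lookups from the workers return the private endpoint IP.
resource "azurerm_private_dns_zone_virtual_network_link" "workers" {
  name                  = "databricks-workers-link"
  resource_group_name   = var.rg
  private_dns_zone_name = azurerm_private_dns_zone.databricks.name
  virtual_network_id    = var.vnet_id
}

# Private endpoint for the ui-api sub-resource, registered in the zone.
resource "azurerm_private_endpoint" "ui_api" {
  name                = "pe-databricks-ui-api"
  resource_group_name = var.rg
  location            = var.location
  subnet_id           = var.private_endpoint_subnet_id # assumed variable

  private_service_connection {
    name                           = "databricks-ui-api"
    private_connection_resource_id = azurerm_databricks_workspace.workspace.id
    subresource_names              = ["databricks_ui_api"]
    is_manual_connection           = false
  }

  private_dns_zone_group {
    name                 = "databricks"
    private_dns_zone_ids = [azurerm_private_dns_zone.databricks.id]
  }
}
```

If the VNet uses custom DNS servers instead, those servers would need to forward or host the same zone. The virtual network link alone would not change what the workers resolve.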