Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Azure Databricks with VNET injection and SCC

Mendi
New Contributor

Hi,

Azure Databricks with VNet injection and SCC needs to communicate with Azure endpoints for the following:

Metastore, artifact Blob storage, system tables storage, log Blob storage, and Event Hubs endpoint IP addresses.

https://learn.microsoft.com/en-us/azure/databricks/resources/ip-domain-region

A couple of questions on the above:

1) Data plane clusters need access to the metastore endpoint on port 3306 only if the Databricks-managed Hive metastore is used. If an external Hive metastore in the cloud is used, does compute instead need connectivity to that metastore? And if Unity Catalog is used, does compute still need to connect to the MySQL endpoint on port 3306 as defined in the link above?

e.g. West Europe

consolidated-westeurope-prod-metastore.mysql.database.azure.com
consolidated-westeurope-prod-metastore-addl-1.mysql.database.azure.com

2) About the Event Hubs endpoints on port 9093: is it a must to have this connection open, and what sort of use case is behind it?

e.g. prod-westeurope-observabilityeventhubs.servicebus.windows.net

Thanks

1 REPLY

Louis_Frolio
Databricks Employee

Hey @Mendi, here’s how connectivity works for Azure Databricks with VNet injection and Secure Cluster Connectivity (SCC) for the endpoints you listed.

 

Key points from the Microsoft Learn reference

  • The page lists, per region, the FQDNs and ports for the workspace-level Hive metastore, artifact Blob storage, system tables storage, log Blob storage, and the Event Hubs endpoint that clusters must reach when you manage egress with UDRs/firewalls. The IPs behind these names change, so allowlist the FQDNs, or resolve the IPs automatically if your firewall requires literal addresses (a small resolution sketch follows after this list).
  • Databricks advises using the Azure Databricks service tag rather than pinned IPs to avoid outages when addresses change.
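
If your firewall does require literal IP addresses, a small scheduled job can resolve the published FQDNs and reconcile the results against your allowlist. A minimal sketch, assuming the West Europe entries from the table (substitute your own region’s names):

```python
import socket

# West Europe FQDNs from the regional table; replace with your region's entries.
fqdns = [
    "consolidated-westeurope-prod-metastore.mysql.database.azure.com",
    "ucstprdwesteu.dfs.core.windows.net",
    "dbartifactsprodwesteu.blob.core.windows.net",
    "dblogprodwesteurope.blob.core.windows.net",
    "prod-westeurope-observabilityeventhubs.servicebus.windows.net",
]

for name in fqdns:
    # The IPs behind these names can change, so re-run this on a schedule
    # and reconcile the output against your firewall allowlist.
    ips = sorted({info[4][0] for info in socket.getaddrinfo(name, None, socket.AF_INET)})
    print(f"{name}: {', '.join(ips)}")
```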

1) Metastore connectivity: Databricks-managed Hive metastore vs external Hive metastore vs Unity Catalog

  • For the Databricks-managed Hive metastore, clusters need outbound access on port 3306/TCP to the regional MySQL FQDNs listed in the table (for West Europe: consolidated-westeurope-prod-metastore.mysql.database.azure.com and consolidated-westeurope-prod-metastore-addl-1.mysql.database.azure.com, among others).
  • If you use an external Hive metastore, your clusters must be able to reach the metastore host and port you configured, because the metastore client connects directly via JDBC in “local mode”; the external metastore doc shows setting javax.jdo.option.ConnectionURL to your MySQL host:port, which implies network connectivity from the cluster to that endpoint (a config sketch follows after this list).
  • When Unity Catalog is used, compute relies on system tables storage (the dfs.core.windows.net endpoint) over HTTPS (443) and not on the MySQL-based Hive metastore endpoints; the MySQL endpoints in the page are for the legacy Hive metastore, while UC’s workspace-level system tables storage is explicitly listed in the table as HTTPS-only.
  • Azure’s VNet-injection guidance also notes that outbound NSG rules must permit ports 443, 3306, and 8443–8451 for service operation in injected workspaces; 3306 is specifically relevant to Hive metastore connectivity (managed or external), not Unity Catalog by itself.
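
For the external Hive metastore case, the relevant Spark config keys come from the Databricks external Hive metastore documentation; the host, credentials, secret reference, and Hive version below are placeholders, so treat this as a sketch only:

```python
# Sketch of the cluster Spark config for an external Hive metastore reachable
# on 3306 (host, user, secret scope, and version are assumptions). Paste the
# key/value pairs into the cluster's Spark config rather than setting them at
# runtime.
external_metastore_conf = {
    "spark.hadoop.javax.jdo.option.ConnectionURL":
        "jdbc:mysql://my-external-metastore.example.com:3306/metastore_db",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
    "spark.hadoop.javax.jdo.option.ConnectionUserName": "hive_user",
    "spark.hadoop.javax.jdo.option.ConnectionPassword": "{{secrets/my-scope/metastore-password}}",
    "spark.sql.hive.metastore.version": "2.3.9",
    # "builtin" works for Hive metastore 2.3.x; older versions need a jar path.
    "spark.sql.hive.metastore.jars": "builtin",
}
```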

West Europe examples (from the table)

  • Metastore (MySQL, 3306/TCP): consolidated-westeurope-prod-metastore.mysql.database.azure.com, consolidated-westeurope-prod-metastore-addl-1.mysql.database.azure.com (and additional listed FQDNs)
  • System tables storage (HTTPS, 443): ucstprdwesteu.dfs.core.windows.net
  • Artifact storage (HTTPS, 443): dbartifactsprodwesteu.blob.core.windows.net (plus additional artifact endpoints)
  • Log Blob storage (HTTPS, 443): dblogprodwesteurope.blob.core.windows.net (a quick reachability check covering all of these follows below)
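
To confirm the egress path before locking in firewall rules, a quick TCP reachability check from a notebook on one of the affected clusters can help. A minimal sketch using the West Europe endpoints above:

```python
import socket

# Endpoint/port pairs from the West Europe row of the regional table.
# Drop the 3306 entry if you only use Unity Catalog and no Hive metastore.
endpoints = [
    ("consolidated-westeurope-prod-metastore.mysql.database.azure.com", 3306),
    ("ucstprdwesteu.dfs.core.windows.net", 443),
    ("dbartifactsprodwesteu.blob.core.windows.net", 443),
    ("dblogprodwesteurope.blob.core.windows.net", 443),
    ("prod-westeurope-observabilityeventhubs.servicebus.windows.net", 9093),
]

for host, port in endpoints:
    try:
        # A completed TCP handshake is enough to prove the egress path is open;
        # no application-level authentication is attempted.
        with socket.create_connection((host, port), timeout=5):
            print(f"OK      {host}:{port}")
    except OSError as exc:
        print(f"BLOCKED {host}:{port} -> {exc}")
```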

2) Event Hubs endpoints on 9093/TCP: required and why

  • The page lists a regional Event Hubs endpoint FQDN (for West Europe: prod-westeurope-observabilityeventhubs.servicebus.windows.net) with port 9093/TCP; if you control egress via UDRs/firewalls, you should allow outbound to this FQDN and port so Databricks services can operate correctly in VNet-injected/SCC environments.
  • Port 9093 is the Kafka-compatible port exposed by Azure Event Hubs, which Databricks uses with Kafka clients; official Databricks instructions for Event Hubs/Kafka explicitly configure bootstrap.servers with :9093 and describe 9093 as the Event Hubs Kafka port (a streaming-read sketch follows after this list).
  • The endpoint naming includes “observabilityeventhubs,” and Databricks documents and examples show Event Hubs used for streaming/telemetry via Kafka-compatible endpoints, which is why the platform publishes this endpoint and port in the regional table.
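
On the “what uses 9093” point: the platform’s observability endpoint is consumed internally by Databricks, not by your code, but the same Kafka-compatible port applies when you read from an Event Hubs namespace you own. A minimal Structured Streaming sketch, run in a Databricks notebook where spark and dbutils are predefined (the namespace, topic, and secret scope are placeholders):

```python
# Placeholders: my-namespace, my-topic, and the secret scope/key are assumptions.
conn_str = dbutils.secrets.get("my-scope", "eventhubs-connection-string")

eh_sasl = (
    "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
    f'username="$ConnectionString" password="{conn_str}";'
)

df = (
    spark.readStream.format("kafka")
    # 9093 is the Kafka-compatible port on the Event Hubs namespace.
    .option("kafka.bootstrap.servers", "my-namespace.servicebus.windows.net:9093")
    .option("subscribe", "my-topic")
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config", eh_sasl)
    .load()
)

display(df)  # or write the stream to a sink of your choice
```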
 

Practical guidance

  • Prefer the Azure Databricks service tag in NSGs/UDRs; if your policy requires explicit FQDN/IP allowlisting, use the table’s FQDNs and resolve IPs periodically because these can change.
  • If you’re on Unity Catalog and don’t use the legacy Hive metastore, ensure HTTPS (443) to the region’s UC system tables storage and the artifact/log storage FQDNs; you don’t need the MySQL 3306 endpoints for UC-only metadata paths.
  • If you use the Databricks-managed Hive metastore, open 3306/TCP to the regional MySQL metastore FQDNs listed; same applies for an external Hive metastore—open to your external metastore host:port and follow the external metastore configuration doc.
  • Keep Event Hubs 9093/TCP open to the regional “observabilityeventhubs” FQDN to ensure Kafka-compatible Event Hubs interactions and platform observability paths function.
 
Cheers, Louis.