Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

External connectivity to VNET Injected & SCC enabled workspace

mat723
New Contributor II

Hi All,

We have a Databricks workspace deployed in our internal Azure VNets (using VNet injection), with secure cluster connectivity (SCC) enabled as well.

We have a couple of external partners who want to connect to our workspaces using REST APIs and JDBC. What are the suggested options to set this up for Azure Databricks?

Thanks 

3 REPLIES

SteveOstrowski
Databricks Employee

Hi @mat723,

Great question -- this is a common scenario when you need to allow external partners to access a VNet-injected Azure Databricks workspace. The good news is that VNet injection and SCC primarily affect the compute plane (where your clusters run), not the control plane (where your workspace UI, REST APIs, and SQL endpoints live). So external partner access is very much achievable.

Here are your options, from simplest to most secure:


OPTION 1: PUBLIC ACCESS WITH IP ACCESS LISTS (SIMPLEST)

By default, even with VNet injection and SCC enabled, the workspace front-end (UI, REST APIs, JDBC/ODBC endpoints) remains publicly accessible over the internet unless you have explicitly disabled public network access. This means your partners can connect to the workspace URL and SQL warehouse endpoints directly.

To add a layer of security, configure IP Access Lists to restrict access to only your partners' known IP addresses:

- Use the IP Access Lists REST API to allowlist your partners' public IP ranges
- This ensures only connections from approved IP addresses can reach the workspace
- This works for both REST API calls and JDBC/ODBC connections since both go through the workspace front-end URL

Requires: Premium plan

Docs: https://learn.microsoft.com/en-us/azure/databricks/security/network/front-end/ip-access-list
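As a minimal sketch, the payload that POST /api/2.0/ip-access-lists expects can be validated and assembled locally before you send it. The label and CIDR ranges below are hypothetical placeholders, not real partner addresses:

```python
import ipaddress
import json

def build_allow_list(label, cidrs):
    """Validate partner CIDR ranges and build the payload for
    POST /api/2.0/ip-access-lists with list_type ALLOW."""
    for cidr in cidrs:
        ipaddress.ip_network(cidr)  # raises ValueError on a malformed range
    return {"label": label, "list_type": "ALLOW", "ip_addresses": cidrs}

# Hypothetical partner ranges -- replace with your partners' real public IPs.
payload = build_allow_list("partner-acme", ["203.0.113.0/24", "198.51.100.17/32"])
print(json.dumps(payload))
```

Validating the CIDRs up front avoids a round trip to the API just to discover a typo in a range.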


OPTION 2: CONTEXT-BASED INGRESS CONTROLS (MORE GRANULAR)

If you need more granular control than IP-based filtering, Context-Based Ingress Controls (Public Preview) let you combine multiple conditions -- user identity, request type, and network source -- into allow/deny rules at the account level.

For example, you could create a rule that allows specific service principals used by your partners to access REST APIs from specific IP ranges, while blocking all other public access.

Docs: https://learn.microsoft.com/en-us/azure/databricks/security/network/front-end/context-based-ingress


OPTION 3: AZURE PRIVATE LINK -- FRONT-END (HYBRID MODE)

If you want to keep public access available for your partners while also enabling private connectivity for your internal users, use Front-End Private Link in Hybrid mode:

- Private Link is active for your internal network (via a databricks_ui_api private endpoint in your transit VNet)
- Public access remains enabled, so external partners can still reach the workspace over the internet
- Combine with IP Access Lists to restrict public access to only your partners' IPs

This gives you the best of both worlds: private connectivity for internal users, controlled public access for external partners.

Docs: https://learn.microsoft.com/en-us/azure/databricks/security/network/front-end/front-end-private-conn...
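A quick way to sanity-check Hybrid mode DNS is to see which path the workspace hostname resolves to: internal users should land on the private endpoint IP, partners on a public front-end IP. This is just a diagnostic sketch using the standard library; the sample IPs are illustrative:

```python
import ipaddress
import socket

def classify_ip(ip: str) -> str:
    """Classify a resolved address as 'private' (Private Link path)
    or 'public' (internet path)."""
    return "private" if ipaddress.ip_address(ip).is_private else "public"

def check_workspace_route(workspace_host: str) -> str:
    """Resolve the workspace URL and report which path DNS steers us to.
    Run this from both an internal host and a partner-like network."""
    return classify_ip(socket.gethostbyname(workspace_host))

# Illustrative addresses: an RFC 1918 private-endpoint IP vs. a public IP.
print(classify_ip("10.139.0.4"))
print(classify_ip("52.230.27.216"))
```

From inside your transit VNet the first behaviour is what you want; from outside, the second.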


OPTION 4: SITE-TO-SITE VPN OR EXPRESSROUTE (MOST LOCKED DOWN)

If your partners have their own Azure environment or you need to eliminate all public internet exposure:

1. Disable public network access entirely on the workspace
2. Set up Front-End Private Link with the databricks_ui_api private endpoint
3. Connect your partners' networks to your transit VNet using:
- Site-to-Site VPN -- if partners have compatible VPN infrastructure
- Azure ExpressRoute -- for dedicated private connections
- VNet Peering -- if partners are also in Azure

Partners would then access the workspace through the private endpoint, with DNS resolving the workspace URL to the private IP.

Docs: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/on-prem-network


IMPORTANT NOTES ABOUT JDBC CONNECTIVITY

JDBC and ODBC connections to Azure Databricks go through the workspace front-end URL (port 443), so they are governed by the same front-end networking controls described above. Your partners would connect using:

jdbc:databricks://<workspace-url>:443/default;transportMode=http;ssl=1;httpPath=<sql-warehouse-http-path>;AuthMech=3;...

This means:
- If public access is enabled, JDBC works over the internet (optionally restricted by IP access lists)
- If public access is disabled with Private Link, JDBC only works from networks with access to the private endpoint

Docs: https://learn.microsoft.com/en-us/azure/databricks/integrations/jdbc-oss/
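To make the connection string above concrete, here is a small sketch that assembles the JDBC URL from its parts. The workspace hostname and warehouse HTTP path are placeholders you would swap for your own values:

```python
def build_jdbc_url(workspace_url: str, http_path: str) -> str:
    """Assemble a Databricks JDBC URL using personal-access-token
    authentication (AuthMech=3) over HTTPS on port 443."""
    return (
        f"jdbc:databricks://{workspace_url}:443/default;"
        f"transportMode=http;ssl=1;"
        f"httpPath={http_path};AuthMech=3"
    )

# Placeholder workspace and warehouse path -- substitute your own.
url = build_jdbc_url("adb-1234567890123456.7.azuredatabricks.net",
                     "/sql/1.0/warehouses/abc123")
print(url)
```

Because everything rides over port 443 to the front-end URL, the same string works whether the partner reaches the workspace publicly or through a private endpoint; only DNS resolution differs.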


MY RECOMMENDATION

For external partners, Option 1 or Option 3 is typically the best fit:

1. Verify that "Allow Public Network Access" is set to Enabled on your workspace (Settings > Networking in the Azure portal)
2. Configure IP Access Lists to allowlist only your partners' IP ranges
3. Optionally, add Front-End Private Link in Hybrid mode so your internal users get private connectivity

This way, your partners can connect via APIs and JDBC from their known IPs, while your internal traffic stays private and your workspace is protected from unauthorized access.
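One implementation detail worth noting for step 2: IP access lists must first be switched on at the workspace level via the workspace-conf API (PATCH /api/2.0/workspace-conf, setting enableIpAccessLists to "true"). The sketch below builds, but does not send, that request; the hostname and token are placeholders:

```python
import json
import urllib.request

def enable_ip_access_lists_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that enables
    IP access lists on the workspace."""
    return urllib.request.Request(
        url=f"https://{workspace_url}/api/2.0/workspace-conf",
        data=json.dumps({"enableIpAccessLists": "true"}).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="PATCH",
    )

# Placeholder workspace URL and PAT -- pass it to urllib.request.urlopen to send.
req = enable_ip_access_lists_request("adb-1234567890123456.7.azuredatabricks.net",
                                     "<your-pat>")
print(req.get_method())
```

Until this flag is enabled, any allow lists you create via the IP Access Lists API are not enforced.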


DOCUMENTATION REFERENCES

- VNet Injection: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/vnet-inject
- Secure Cluster Connectivity: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/secure-cluster-connectiv...
- Front-End Networking: https://learn.microsoft.com/en-us/azure/databricks/security/network/front-end/
- IP Access Lists: https://learn.microsoft.com/en-us/azure/databricks/security/network/front-end/ip-access-list
- Front-End Private Link: https://learn.microsoft.com/en-us/azure/databricks/security/network/front-end/front-end-private-conn...
- Private Link Concepts: https://learn.microsoft.com/en-us/azure/databricks/security/network/concepts/private-link
- On-Premises Connectivity: https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/on-prem-network
- JDBC Driver: https://learn.microsoft.com/en-us/azure/databricks/integrations/jdbc-oss/

Hope this helps! Let me know if you need more details on any of these options.

* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for any obvious issues and for monitoring system reliability and update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand new features.

Thanks @SteveOstrowski for the detailed response.

We implemented a slightly different version of options 3 and 4: we stood up a public proxy/gateway that is open only to the partner, and the gateway controls incoming traffic and connects to the Databricks workspace over a private link.

Do you see any issues with this approach? I understand there is a cost associated with running the proxy.

SteveOstrowski
Databricks Employee

Hi mat723,

That approach is sound and is actually a fairly common pattern. Standing up a public-facing proxy/gateway that is locked down to only the partner's IP ranges, with the backend connecting to the Databricks workspace over Private Link, gives you a clean separation of concerns: your proxy handles partner authentication and ingress filtering, while the workspace itself remains fully private with no public network exposure.

A few things to keep in mind:

  • Security: Make sure your proxy enforces TLS termination and re-encryption on the backend leg to Databricks. Also ensure that the proxy is hardened -- it becomes the sole public entry point, so it needs to be kept patched and monitored. Consider adding WAF rules if using something like Azure Application Gateway or a third-party reverse proxy.
  • Authentication pass-through: The proxy needs to correctly forward Databricks authentication headers (typically a Bearer token for REST APIs, or JDBC auth parameters). Verify that the proxy does not strip or modify Authorization headers.
  • Cost: As you noted, there is a cost to running the proxy infrastructure. Depending on throughput, an Azure Application Gateway or a small VM-based NGINX/HAProxy setup could work. For lower traffic, a lightweight VM is usually sufficient. For production workloads with high concurrency, Application Gateway with autoscaling may be more appropriate.
  • DNS resolution: Since the proxy connects to the workspace via Private Link, it needs to resolve the workspace URL to the private endpoint IP. Make sure your proxy's DNS is configured to use a private DNS zone linked to the VNet where the Private Link endpoint lives.
  • Latency: Adding a proxy hop introduces some latency. For REST API calls this is usually negligible. For high-volume JDBC/ODBC workloads, measure the impact to make sure it meets your performance requirements.
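On the authentication pass-through point, the core logic is simply: strip hop-by-hop headers, forward everything else (including Authorization) untouched. A minimal, proxy-agnostic sketch of that filtering rule, with a hypothetical token value:

```python
# Hop-by-hop headers a reverse proxy should strip per RFC 9110;
# everything else, notably Authorization, must reach Databricks unchanged.
HOP_BY_HOP = {"connection", "keep-alive", "proxy-authorization",
              "te", "trailers", "transfer-encoding", "upgrade"}

def forward_headers(incoming: dict) -> dict:
    """Return the header set the proxy should send on the backend leg."""
    return {k: v for k, v in incoming.items() if k.lower() not in HOP_BY_HOP}

headers = forward_headers({
    "Authorization": "Bearer <partner-token>",  # placeholder token
    "Connection": "keep-alive",
    "Content-Type": "application/json",
})
print(sorted(headers))
```

Whatever proxy you run (NGINX, HAProxy, Application Gateway), verifying it behaves equivalently to this rule is a quick smoke test before handing the endpoint to the partner.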

Overall, this is a valid and well-architected pattern. It is essentially a variant of Options 3 and 4 from my earlier response, with the proxy replacing the need for a direct VPN or ExpressRoute connection to the partner. The key advantage is that the partner does not need any Azure infrastructure on their side -- they just need network access to your proxy endpoint.
