05-02-2025 09:45 AM - edited 05-02-2025 09:46 AM
Hi Databricks Community,
I'm working through some networking challenges when connecting Databricks clusters to various data sources and wanted to get advice or best practices from others who may have faced similar issues.
Current Setup:
I have four types of source systems that I need to connect to from Databricks:
1. Customer Plane Clusters → Source in Azure VNet
Approach: Peered the Databricks Customer VNet with the source system’s VNet.
Connectivity: Whitelisted the NAT Gateway Public IP in the source system’s firewall.
2. Customer Plane Clusters → On-Prem System
Approach: Established a Site-to-Site VPN between the Databricks Customer VNet and On-Prem network.
Connectivity: Whitelisted the private IPs on the on-prem side.
3. Control Plane Clusters (Serverless Compute) → Azure Services (Storage Account, MySQL, etc.)
Approach: Using Network Connectivity Configuration (NCC) in Databricks.
4. Control Plane Clusters (Serverless Compute) → On-Prem System
Approach: Not applicable yet — looking for guidance here.
Connectivity challenge: Unable to establish direct connectivity due to lack of support for peering or site-to-site connections from Control Plane to On-Prem.
The Problem: I'm running into networking limitations when trying to connect Serverless Compute (Control Plane) to systems behind firewalls, especially on-premises systems and services in other CSPs or SaaS applications.
Issue A: No Static Outbound IPs for Serverless Compute
For external systems behind a firewall, there is no static public IP address available from serverless compute to whitelist.
Issue B: No Network-Level Integration with On-Prem Networks
Unlike customer-managed clusters, serverless compute does not support peering or site-to-site connections, and there is no direct network-level communication.
Issue C: Limited Support for Hybrid or Multi-Cloud Scenarios
There’s currently no supported way to securely connect Databricks serverless compute to on-premises systems, services in other cloud providers, or SaaS applications that enforce IP-based access control.
2 weeks ago
Greetings @chandru44 , Thanks for sharing this detailed networking setup—you've clearly done thorough work mapping out your connectivity patterns. You've correctly identified the fundamental architectural limitation with serverless compute and on-premises connectivity. Let me address your concerns and provide some practical guidance.
You're absolutely right that Databricks serverless compute does not currently support direct network-level integration with on-premises systems. This is by design—serverless compute runs in the Databricks-managed control plane, not in your customer VNet, which means traditional networking approaches (VNet peering, Site-to-Site VPN) cannot be applied.
Network Connectivity Configurations (NCC) are designed specifically for Azure-native resources only (Storage Accounts, MySQL, Cosmos DB, etc.) using Azure Private Link and managed private endpoints. NCC cannot establish connectivity to on-premises systems because it relies on Azure's Private Link infrastructure, which doesn't extend to on-prem networks.
The most reliable approach is to use a staging/synchronization layer:
1. Use your customer plane clusters (with Site-to-Site VPN connectivity) to extract data from on-prem systems
2. Land the data in Azure storage (ADLS Gen2, Blob Storage) with appropriate governance
3. Use serverless compute to process the data in Azure
4. Optionally sync processed results back to on-prem via customer plane clusters
This pattern prevents overwhelming on-prem systems with cloud-scale operations and provides better performance and cost optimization.
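The four steps above can be sketched as follows, assuming a Databricks notebook where the global `spark` session and `dbutils` utilities are predefined; the JDBC host, storage account, secret scope, and paths are hypothetical placeholders, not real endpoints:

```python
def adls_path(container: str, account: str, relative: str) -> str:
    """Build the ADLS Gen2 (abfss) URI for the staging location."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{relative}"


def extract_onprem_to_staging(spark, dbutils, secret_scope: str = "onprem"):
    """Steps 1-2: runs on a classic (VNet-injected) cluster that can reach
    the on-prem database over the Site-to-Site VPN."""
    df = (
        spark.read.format("jdbc")
        # Hypothetical on-prem SQL Server, reachable only from the customer VNet.
        .option("url", "jdbc:sqlserver://onprem-db.internal:1433;database=sales")
        .option("dbtable", "dbo.orders")
        .option("user", dbutils.secrets.get(secret_scope, "db-user"))
        .option("password", dbutils.secrets.get(secret_scope, "db-password"))
        .load()
    )
    # Land the extract as Delta in Azure storage with appropriate governance.
    df.write.format("delta").mode("overwrite").save(
        adls_path("staging", "mystorageacct", "sales/orders")
    )


def process_on_serverless(spark):
    """Step 3: serverless compute reads the staged Delta data via NCC."""
    return spark.read.format("delta").load(
        adls_path("staging", "mystorageacct", "sales/orders")
    )
```

Step 4 (syncing results back to on-prem) would be the mirror image: a classic cluster reads the processed Delta output and writes it back over JDBC.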
If real-time access is required:
1. Deploy an Azure Application Gateway or API Management service in your Azure VNet
2. Configure ExpressRoute or Site-to-Site VPN from Azure to on-prem
3. Expose on-prem services through this gateway with appropriate authentication
4. Configure NCC firewall rules to allow serverless compute to access the gateway's subnet
Note: This still requires the gateway to be an Azure resource that NCC can reach.
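From the serverless side, the call to such a gateway is plain HTTPS. A minimal sketch using only the standard library; the gateway hostname and path are hypothetical, and the `Ocp-Apim-Subscription-Key` header assumes the gateway is API Management using subscription-key authentication:

```python
import json
import urllib.request

# Hypothetical Application Gateway / APIM frontend; must be an Azure endpoint
# that the NCC configuration permits serverless compute to reach.
GATEWAY_BASE = "https://onprem-gw.example.com"


def gateway_request(path: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated request to the gateway."""
    return urllib.request.Request(
        f"{GATEWAY_BASE}{path}",
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )


def call_onprem_service(path: str, api_key: str):
    """The gateway forwards the request over ExpressRoute/VPN to on-prem."""
    with urllib.request.urlopen(gateway_request(path, api_key), timeout=30) as resp:
        return json.load(resp)
```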
For workloads that must have direct on-prem connectivity, continue using customer plane clusters with VNet injection. You can design a hybrid architecture where:
- Serverless compute handles Azure-native workloads (better performance, lower cost)
- Classic compute clusters handle on-prem connectivity workloads
- Both layers interact via Unity Catalog tables
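The Unity Catalog handoff between the two layers can be sketched like this, assuming a notebook-provided `spark` session; the catalog, schema, and table names are hypothetical:

```python
def uc_table(catalog: str, schema: str, table: str) -> str:
    """Three-level Unity Catalog name shared by both compute layers."""
    return f"{catalog}.{schema}.{table}"


def land_from_onprem(spark, df):
    """Classic (VNet-injected) cluster: append the on-prem extract
    to a governed bronze table."""
    df.write.mode("append").saveAsTable(uc_table("main", "bronze", "onprem_orders"))


def curate_on_serverless(spark):
    """Serverless compute: all further processing reads the same table,
    with no network path to on-prem required."""
    return spark.sql(
        f"SELECT * FROM {uc_table('main', 'bronze', 'onprem_orders')} "
        "WHERE order_ts >= current_date() - INTERVAL 7 DAYS"
    )
```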
Issue A - No Static Outbound IPs: Correct—serverless compute does not provide static public IPs for whitelisting. For customer plane clusters, you can whitelist NAT Gateway IPs, but this isn't available for serverless.
Issue B - No Network Integration: This is a fundamental architectural constraint. Serverless compute prioritizes rapid scaling and zero infrastructure management over custom networking.
Issue C - Multi-Cloud/SaaS Limitations: For external systems requiring IP whitelisting, the hybrid sync pattern (Option 1) or exposing services via Azure-native endpoints (Option 2) are your best options.
- Delta Sharing: For governed data distribution between on-prem and cloud environments, consider using Delta Sharing to share curated datasets without data duplication.
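For the Delta Sharing option, a recipient reads a shared table through a profile file using the `delta-sharing` Python client; the profile path, share, schema, and table names below are hypothetical:

```python
def sharing_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Delta Sharing addresses a table as <profile>#<share>.<schema>.<table>."""
    return f"{profile_path}#{share}.{schema}.{table}"


def read_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into pandas (requires `pip install delta-sharing`)."""
    import delta_sharing

    return delta_sharing.load_as_pandas(
        sharing_table_url(profile_path, share, schema, table)
    )
```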
- Cost Optimization: The hybrid approach actually helps with cost—you avoid unnecessary data egress charges and can optimize compute usage based on workload type.
- Security: The staging pattern provides better data governance and audit trails compared to direct connectivity.
Unfortunately, there's no direct solution for serverless-to-on-prem connectivity in the current architecture. The workarounds above represent the practical approaches used by organizations facing similar constraints.
Hope this helps clarify your options!
Cheers, Louis.
3 weeks ago
Thank you for posting this question. I am encountering the exact same scenarios with Databricks serverless compute while trying to connect to on-prem systems via site-to-site VPN as well as third party SaaS applications requiring IP-based access control. Has anyone figured out the correct way to address these issues?
Thanks!
2 weeks ago
Thank you Louis for the detailed explanation and guidance!