Databricks Community

mnissen1337 · 10 hours ago

I’m trying to understand the networking implications of moving some logic to Databricks Serverless / SDP.

My current setup is a notebook running as a job on classic compute, and this works because outbound traffic goes through a NAT Gateway, so we can expose a stable public IP that a customer can whitelist for their REST API.

As I understand it, Databricks Serverless compute does not run inside my own VNET/VPC, so I can’t simply control outbound IPs in the same way or attach a NAT Gateway.

In that case, I’m unsure how this kind of integration is supposed to be handled if I move the logic into SDP (for example, using a custom PySpark data source I’ve built, with serverless as the compute).

The use case is still that I need to call a customer’s REST API that requires IP whitelisting.

What is the recommended way to handle this? Thanks!

Ashwin_DSA · 10 hours ago

Hi @mnissen1337,

Your understanding is basically right. With classic compute, the workload runs in your own VNet/VPC, so using your own NAT Gateway to present a stable public egress IP is a standard pattern. With serverless, the compute runs in the Databricks-managed serverless compute plane instead, so you don’t manage egress the same way and can’t attach your own NAT Gateway directly to the compute. Databricks documents this model in its "Serverless compute plane networking" pages for AWS and Azure.

So for a REST API that requires IP allowlisting, there are two usual serverless patterns:

- If the API is publicly reachable, allowlist the Databricks-managed serverless outbound IPs for your workspace/region, rather than expecting a single NAT IP that you own. On AWS, this is described in "Serverless compute firewall configuration". Databricks notes that these outbound IPs are published and can change over time, so they should be treated as a managed allowlist rather than a permanently fixed single IP.
- If you really need traffic to come from a single controlled static IP, the cleaner pattern is usually to keep that specific integration on classic compute, or put a proxy/broker in your own network and have serverless reach that privately. Databricks documents the private-connectivity option for internal resources via AWS PrivateLink to resources in your VPC and Azure Private Link to resources in your VNet.
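Because the published outbound IPs can change, it helps to compare each newly published list against what the customer currently has allowlisted. Here is a minimal sketch of that check using only the Python standard library. The CIDR ranges are placeholder documentation ranges, not real Databricks IPs, and the function names are hypothetical; how you fetch the published list is left out on purpose.

```python
import ipaddress


def diff_allowlist(previous, current):
    """Return (added, removed) CIDR strings between two allowlists."""
    prev = {ipaddress.ip_network(c) for c in previous}
    curr = {ipaddress.ip_network(c) for c in current}
    added = sorted(str(n) for n in curr - prev)
    removed = sorted(str(n) for n in prev - curr)
    return added, removed


def is_allowed(ip, cidrs):
    """Check whether a single IP falls inside any range of the allowlist."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c) for c in cidrs)


# Placeholder ranges (RFC 5737 documentation blocks), NOT real egress IPs.
previously_allowlisted = ["203.0.113.0/24", "198.51.100.0/24"]
currently_published = ["203.0.113.0/24", "192.0.2.0/24"]

added, removed = diff_allowlist(previously_allowlisted, currently_published)
```

Anything in `added` would need to go to the customer before traffic starts arriving from it, and anything in `removed` can be retired from their allowlist.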

In short, if the external API can accept the Databricks-managed serverless egress ranges, serverless can still work well. If the requirement is specifically "one static public IP that I control," that usually points to either classic compute or a private proxy pattern rather than direct egress from serverless.
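For the proxy pattern, the calling code mostly just needs to send its requests through the proxy instead of straight to the customer. A rough sketch with the standard library is below. The proxy hostname and port are made up for illustration, the function names are hypothetical, and it assumes the proxy is already reachable from serverless over the private connectivity described above and that the proxy itself egresses through your static IP.

```python
import urllib.request

# Hypothetical address of a forward proxy running in your own network.
PROXY_URL = "http://proxy.internal.example:3128"


def build_proxy_opener(proxy_url):
    """Build an opener that routes HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


def call_customer_api(opener, url, timeout=10):
    """Call the customer's REST API through the proxy; returns (status, body)."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with opener.open(req, timeout=timeout) as resp:
        return resp.status, resp.read()


opener = build_proxy_opener(PROXY_URL)
```

From the customer's side, the request then appears to come from the proxy's egress IP, which is the address they would allowlist.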

Hope this helps.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***
