Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Why is Databricks Using Private IP Instead of NAT Gateway's Public IP to Connect with Source System

chandru44
New Contributor

I have a publicly accessible SQL database that is protected by a firewall. I am trying to connect this SQL database to Databricks, but I'm encountering an authentication error. I have double-checked the credentials, port, and host, and they are all correct.

To allow the connection, I whitelisted the public IP of the NAT gateway attached to the Databricks subnet in the SQL database firewall. However, when I checked the SQL database logs, I noticed that the incoming requests are not coming from the NAT gateway's public IP. Instead, the requests are originating from the private IP of the Databricks all-purpose cluster.

Why is this happening, and how can I resolve it?

1 REPLY

Aviral-Bhardwaj
Esteemed Contributor III

The issue occurs because the Databricks cluster's outbound traffic isn't being routed through the NAT Gateway, most likely due to misconfigured network settings or conflicting outbound connectivity configurations. Your network team will probably need to resolve most of this, but you can also check the following:

1. Verify NAT Gateway Configuration

  • Ensure the NAT Gateway is directly attached to the Databricks cluster's subnet.

  • Confirm the NAT Gateway has a public IP or prefix assigned.

  • Check User-Defined Routes (UDRs) in the subnet's route table to ensure all outbound traffic (0.0.0.0/0) is directed to the NAT Gateway.

2. Check for Conflicting Outbound Configurations

  • Instance-level public IPs on Databricks VMs or Azure Load Balancer outbound rules can override the NAT Gateway. Remove these if present.

  • If Azure Firewall is deployed, ensure it isn’t handling outbound traffic for the subnet.

3. Validate Traffic Flow

  • Use a test notebook to check the outbound IP:

    %sh curl ifconfig.me

    The result should match the NAT Gateway’s public IP.

  • If the IP is incorrect, restart the cluster to force new connections through the NAT Gateway.
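
The same check can also be done in Python from a notebook cell, which makes it easier to compare against several allowlisted addresses at once. This is only a sketch: the entries in `ALLOWED_EGRESS` are placeholder documentation addresses rather than real NAT Gateway IPs, and `https://ifconfig.me/ip` is assumed to return the caller's address as plain text.

```python
import ipaddress
import urllib.request

# Placeholder values (documentation ranges): replace with the NAT Gateway's
# public IPs and/or public IP prefixes.
ALLOWED_EGRESS = ["203.0.113.10", "198.51.100.0/30"]


def is_allowed(ip, allowed=ALLOWED_EGRESS):
    """Return True if `ip` falls inside any allowed IP or prefix."""
    addr = ipaddress.ip_address(ip.strip())
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in allowed)


def current_egress_ip(url="https://ifconfig.me/ip", timeout=10):
    """Ask an external echo service which source IP it sees (assumed endpoint)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode().strip()


def check_egress():
    """Print whether the cluster's current egress IP is in the allowlist."""
    ip = current_egress_ip()
    status = "matches" if is_allowed(ip) else "does NOT match"
    print(f"Egress IP {ip} {status} the allowlisted NAT Gateway addresses")
```

Calling `check_egress()` in a notebook cell attached to the cluster prints the address the outside world sees and whether it is covered by the list.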

4. Whitelist All NAT Gateway IPs

  • If the NAT Gateway uses multiple public IPs, all must be whitelisted in the SQL database firewall.

5. Ensure Subnet Association

  • Confirm the Databricks cluster’s VNet-injected subnet is the same subnet where the NAT Gateway is attached.

6. Check Secure Cluster Connectivity (SCC)

  • SCC being disabled allows the use of NAT Gateway without restrictions. If SCC is enabled, it may bypass the NAT Gateway.

Example Route Table Configuration

Address Prefix | Next Hop Type | Next Hop IP Address
0.0.0.0/0      | NAT Gateway   | (NAT Gateway IP)
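
As a rough illustration of why the default route matters: route selection picks the most specific (longest) matching prefix for each destination, so a narrower route can pull traffic for one address range away from the 0.0.0.0/0 entry. The sketch below models that with a hypothetical route list. The prefixes and next-hop labels are made up for illustration and are not read from any real route table.

```python
import ipaddress

# Hypothetical routes for illustration: (address prefix, next hop label).
ROUTES = [
    ("0.0.0.0/0", "NAT Gateway"),
    ("10.0.0.0/8", "Virtual network"),
]


def next_hop(dest_ip, routes=ROUTES):
    """Return the next hop of the longest matching prefix, or None if nothing matches."""
    dest = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific route is the one with the largest prefix length.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(next_hop("203.0.113.50"))  # public destination, only the default route matches
print(next_hop("10.1.2.3"))      # private destination, the /8 route is more specific
```

If the SQL database's address ever matches a more specific route than 0.0.0.0/0, its traffic will not follow the default path, which is worth ruling out when the source IP seen by the database is unexpected.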

Once these points are addressed, your Databricks cluster’s outbound traffic should use the NAT Gateway’s public IP, so the SQL database firewall will see the whitelisted IP.
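
A plain TCP connection test from a notebook can also help narrow down where the connection is being rejected. A minimal sketch, assuming a hypothetical host name and port 1433 (an assumption; substitute your database's actual host and port):

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Hypothetical values: replace with the SQL database's host and port.
# print(can_connect("sql.example.com", 1433))
```

If this returns False, the traffic is being dropped before it reaches the database. If it returns True but the login still fails, the rejection is happening at the database layer, which can mean either the credentials or an IP allowlist that is enforced after the connection is accepted.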

Aviral 🙂 😁

 

AviralBhardwaj
