07-23-2025 06:53 AM
Hi community,
I’m experiencing a strange issue with my connection from Databricks to an SFTP server.
I provided them with an IP address created for Databricks via a NAT gateway, and that IP is whitelisted on their side. However, even though I have the correct credentials, I’m still having trouble connecting to the SFTP server.
Could you help me understand what might be causing this issue and what I should check or fix?
07-24-2025 04:04 AM - edited 07-24-2025 04:10 AM
Hi @szymon_dybczak,
But do we need to set an outbound rule in the network security group (NSG) of the Databricks subnets?
07-24-2025 04:34 AM - edited 07-24-2025 05:01 AM
Yes, you should have an outbound rule that allows traffic from the Databricks subnets to the SFTP destination on the proper port.
07-24-2025 05:18 AM - edited 07-24-2025 05:34 AM
Hi syz,
Do you mean that the source address needs to be the NAT Gateway IP or the Databricks subnet? And the destination is the client's IP?
Could an inbound rule also be needed?
ps2: the sftp server is inside their Azure services
07-24-2025 05:58 AM - edited 07-24-2025 05:58 AM
Hi @jeremy98 ,
On your side you should have something like that:
Outbound NSG Rule (on your Databricks subnet NSG):
Field | Value |
Direction | Outbound |
Priority | Set it according to your NSG rules (lower number means higher priority) |
Source | VirtualNetwork or your Databricks subnet IP range |
Source port | * |
Destination | IP of the SFTP server |
Destination port | 22 |
Protocol | TCP |
Action | Allow |
Name | e.g. Allow-SFTP-out |
But to be honest, this kind of troubleshooting should involve both parties. Even simple verification on their side, such as logs showing which IP address is connecting to the SFTP server, could help diagnose the problem faster.
07-24-2025 06:37 AM - edited 07-24-2025 06:38 AM
Hi @szymon_dybczak,
So the SFTP client needs to whitelist our subnet address instead of our NAT Gateway IP?
Yep, we are going to ask them whether they have any logs, but they previously said no.
07-24-2025 07:03 AM - edited 07-24-2025 07:05 AM
Hi @jeremy98 ,
No, on the SFTP side they need to whitelist the NAT Gateway IP (which they already did, based on your previous messages). So on your side everything looks correct: you have a NAT Gateway, your Databricks subnets use it for outbound traffic, and you have set up the NSG correctly (and I assume you don't use Azure Firewall in your environment).
Moreover, the following test was successful, so maybe this is not a connectivity issue but rather some kind of issue on the SFTP side.
nc -zv your_sftp_address 22
So it looks like you are able to reach that server, but the SFTP server then terminates the session during or immediately after the auth handshake. Maybe the server expects something specific, such as a particular SSH version or SSH banner? Hard to say without proper logs.
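If `nc` is not convenient from a notebook, a rough Python equivalent of the port check could look like the sketch below. It opens a TCP connection and returns the first line the server sends, which for an SSH server is normally its identification string (for example `SSH-2.0-...`). The function name `read_ssh_banner` and the timeout value are my own assumptions, and the sketch simply returns the first line without handling servers that send extra text before the banner.

```python
import socket


def read_ssh_banner(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Open a TCP connection and return the first line sent by the server.

    For an SSH/SFTP server this is normally the identification string.
    Raises OSError (e.g. a timeout or connection refused) if the port
    cannot be reached at all.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        data = b""
        # Read until the first newline, a size cap, or the server closing.
        while b"\n" not in data and len(data) < 255:
            chunk = sock.recv(255)
            if not chunk:
                break
            data += chunk
    # Keep only the first line and drop the trailing CR/LF.
    return data.split(b"\n", 1)[0].decode("utf-8", errors="replace").strip()
```

Calling `read_ssh_banner("your_sftp_address")` from a Databricks notebook and getting a banner back would confirm that the TCP path works and also show which SSH server software answers.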
07-25-2025 08:29 AM
Hi all,
Thanks again to @szymon_dybczak for the earlier help!
I'm working alongside Jeremy, and we’ve been debugging outbound connectivity for Databricks traffic going through a NAT Gateway, specifically for connecting to an SFTP server also hosted in Azure.
Our current theory is that Azure’s backbone network is overriding NAT Gateway routing, since both source (Databricks) and destination (SFTP) are on Azure.
We found this in the Azure docs on user-defined routes:
If the destination address is for an Azure service, Azure routes the traffic directly to the service over the Azure backbone network instead of routing the traffic to the internet.
Traffic between Azure services doesn't traverse the internet, regardless of region.
You can override the Azure default system route for 0.0.0.0/0 with a custom route.
NAT Gateway is configured and working
Public IP is returned correctly via:
Databricks subnets (both public and private)
Are explicitly associated with the NAT Gateway
Have "private subnet (no default outbound access)" setting enabled
Port 22 is open
NSG rules allow outbound TCP 22
Authentication to the SFTP server succeeds, but the session fails shortly after login
Egress IP Test Shows Internal Address
When we run a basic egress test like this:
It returns a private IP, not the expected NAT Gateway public IP.
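The exact command used for this test is not shown here, so the following is only a sketch of how such a check might be scripted in Python. `classify_egress_ip` uses the standard `ipaddress` module to tell whether an address is private or public, and `fetch_egress_ip` reads the address from an IP-echo endpoint. Both function names and the `echo_url` parameter are hypothetical; the URL is whatever echo service you trust.

```python
import ipaddress
import urllib.request


def classify_egress_ip(ip_text: str) -> str:
    """Return 'private' for private-range addresses, 'public' otherwise."""
    ip = ipaddress.ip_address(ip_text.strip())
    return "private" if ip.is_private else "public"


def fetch_egress_ip(echo_url: str, timeout: float = 10.0) -> str:
    """Ask an IP-echo endpoint which source address it saw.

    echo_url is a placeholder for a service that returns the caller's
    IP address as plain text.
    """
    with urllib.request.urlopen(echo_url, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()
```

If the address seen by the remote side classifies as `private`, the traffic did not leave through the NAT Gateway's public IP.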
Service endpoints are not enabled
(We previously had Microsoft.Storage as a service endpoint, but removed it to avoid bypassing NAT.)
Are there specific subnet settings (beyond NSG and service endpoint removal) we need to ensure to force public egress?
Do we need a custom route table explicitly targeting 0.0.0.0/0 to Internet to guarantee traffic goes out via the NAT Gateway, even when the destination is another Azure-hosted service?
Any insights or experiences dealing with this kind of internal routing override would be super helpful. Thanks!
Here is a quick sketch of how I think the flow is going
07-28-2025 12:14 AM - edited 07-28-2025 12:23 AM
Hi @Kenji_3000 ,
Thanks for all the details. Your suspicion could be correct. I think the best way to check is to create a UDR (user-defined route) that forces all outgoing traffic from the Databricks subnets to go to the Internet (and hence use the NAT Gateway).
Did you get any logs from the SFTP admins? Which IP address do they see in the logs when you attempt to connect?
07-28-2025 02:05 AM
Hi,
Thanks for your help again. The client doesn't want to give us any logs; they say they don't have them on their side.
07-28-2025 12:06 AM
up
07-28-2025 02:26 AM
@szymon_dybczak, I noticed that the client's password contains a '/'. Could this be a potential cause of the error?
07-28-2025 02:39 AM - edited 07-28-2025 02:44 AM
Hi @jeremy98 ,
Could be. Look at the following thread: they used an ampersand in a password and that gave them a headache.
Ampersand in Password - RouterOS / Scripting - MikroTik community forum
In another thread someone had authentication issues and tried to use look_for_keys=False option. Could be worth trying:
ssh.connect(hostname="x.x.x.x", port=xxxx, username="x", password="x", look_for_keys=False)
Also, another thing worth trying is to downgrade paramiko to version 2.8.1 or set disabled_algorithms={'keys': ['rsa-sha2-256', 'rsa-sha2-512']}:
SSH Authentication fails with 2.9.2 · Issue #1984 · paramiko/paramiko
python - Paramiko AuthenticationException issue - Stack Overflow
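To try these options together, something like the sketch below might help. `build_connect_kwargs` and `open_sftp` are hypothetical helper names, not part of paramiko. The `disabled_algorithms` dict includes the `keys` entry suggested above and also `pubkeys`, which as far as I know is the entry that governs public-key authentication algorithms; treat that as an assumption and check it against your paramiko version. With password-only authentication these algorithm settings may have no effect at all.

```python
def build_connect_kwargs(host, port, username, password, disable_rsa_sha2=False):
    """Collect keyword arguments for paramiko.SSHClient.connect()."""
    kwargs = {
        "hostname": host,
        "port": port,
        "username": username,
        "password": password,
        "look_for_keys": False,  # do not try private keys found in ~/.ssh
        "allow_agent": False,    # do not try keys from an SSH agent
    }
    if disable_rsa_sha2:
        algos = ["rsa-sha2-256", "rsa-sha2-512"]
        # "keys" is the entry suggested in this thread; "pubkeys" is added
        # as an assumption about which entry covers public-key auth.
        kwargs["disabled_algorithms"] = {"keys": list(algos), "pubkeys": list(algos)}
    return kwargs


def open_sftp(**kwargs):
    """Connect with paramiko and return (ssh_client, sftp_client)."""
    import paramiko  # imported lazily so the helper above works without it

    client = paramiko.SSHClient()
    # Accepting unknown host keys is for troubleshooting only.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(**kwargs)
    return client, client.open_sftp()
```

A password containing `/` needs no escaping when it is passed as a Python string like this, so if the password is read from a secret scope and handed straight to `build_connect_kwargs`, the slash itself is unlikely to be the problem on the client side.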
07-28-2025 03:01 AM
Hi syz,
Doesn't change the issue, very strange 😞
07-28-2025 04:08 AM
Did you also try downgrading paramiko to a lower version?
07-29-2025 06:27 AM
Hi @szymon_dybczak ,
We confirmed that it is indeed the backbone network that is causing the issue, as we obtained the logs of the SFTP server:
Databricks --> SFTP (region outside Europe West) = Public IP NAT gateway
Databricks --> SFTP (region Europe West) = Private IP
Currently in contact with Microsoft support to overrule this backbone network. I tried to define a route table with
Unfortunately, this doesn't work either. Any ideas?