Community Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Unable to read data from API due to Private IP Restriction

AravindNani
New Contributor

I have data in my API Endpoint but am unable to read it using Databricks. My data is limited to my private IP address and can only be accessed over a VPN connection. I can't read data into Databricks as a result of this. I can obtain the data in VS Code by using the same code, which runs locally and establishes a VPN connection when accessing data. I can't access the data even with a VPN connection because Databricks is a cloud service. I'm getting a connection timeout error. I'm currently using Databricks Free tier; is there a way for me to resolve this problem?

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @AravindNani

  • Ensure that your Databricks cluster has the necessary network configuration to access your API Endpoint. Check if there are any firewall rules or network restrictions that might be causing the timeout.
  • You mentioned that your data is limited to your private IP address and can only be accessed over a VPN connection. Make sure that your Databricks cluster is configured to use the same VPN connection.
  • Verify that your Databricks instance has the appropriate firewall rules to allow communication with your API Endpoint. Missing rules for specific ports (such as port 1433 for SQL databases) can often lead to connection ...
  • You can execute the following command in a Databricks Notebook to check if you can connect to your SQL server:

    %sh nc -zv <your_sqlserver_host> 1433

  • Sometimes certificate issues can cause connection timeouts. Ensure that your certificates are correctly configured and valid.
  • If you're using Databricks Free tier, consider upgrading to a higher tier. The Free tier might have limitations that affect network connectivity and performance.
  • Additionally, check if there are any resource constraints or limitations specific to the Free tier that could be causing the issue.
  • Increase the SocketTimeout value in the JDBC connection URL. Sometimes long-running queries can lead to timeouts, and adjusting this parameter might help.

Wojciech_BUK
Valued Contributor III

Hi AravindNani

This is more of an infrastructure question. You have to make sure that:

1) Your Databricks workspace is provisioned in VNet injection mode.

2) Your VNet is either peered to a "hub" network that has the S2S VPN connection to the API, or you configure a VPN tunnel on the VNet where you provisioned Databricks.

3) You use a normal cluster (not serverless).

Test your connection by running this command in a notebook:

%sh nc -zv <your_API_Endpoint_IP> <port>
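
If `nc` is not available on the cluster, roughly the same check can be done from a Python cell. This is only a minimal sketch using the standard `socket` module; the function name `probe` and the 5-second default timeout are my own choices for illustration, not anything Databricks provides. A "timeout" result usually points at routing or a firewall between the cluster and the private IP, while "refused" means the host was reached but nothing is listening on that port.

```python
import socket


def probe(host, port, timeout=5.0):
    """Try to open a TCP connection and report what happened.

    Returns "open", "timeout", "refused", or "error: <details>".
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer at all: typically no route, or a firewall dropping packets.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but no service is listening on this port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. DNS failure or network unreachable.
        return f"error: {exc}"


# Example call (placeholders, replace with your own values):
# print(probe("<your_API_Endpoint_IP>", 443))
```

Run it once from your laptop over the VPN and once from the notebook; if the first says "open" and the second says "timeout", the cluster simply has no network path to the API.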

If you want to use the API's friendly name (FQDN), you will have to deal with DNS resolution.
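
To see what the cluster actually resolves the name to, something like the following can be run in a Python cell. It is a rough sketch with the standard `socket` and `ipaddress` modules; `resolve_host` and `is_private` are hypothetical helper names made up for this example. An empty list means the cluster's DNS does not know the name at all, and a public address where you expected a private one suggests the lookup is not going through your internal DNS.

```python
import ipaddress
import socket


def resolve_host(hostname):
    """Return the sorted IP addresses `hostname` resolves to, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The resolver used by this machine could not resolve the name.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def is_private(ip):
    """Return True if `ip` falls in a range that Python's ipaddress module treats as private."""
    return ipaddress.ip_address(ip).is_private


# Example call (placeholder, replace with your API's FQDN):
# for ip in resolve_host("<your_API_FQDN>"):
#     print(ip, "private" if is_private(ip) else "public")
```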

There is an article that explains it:

https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/on-prem-network

----

There are a few alternatives to that:

  • you set up a static egress public IP for Databricks (e.g. a NAT gateway) and whitelist it on your API side
  • if your API is cloud-based, you can try to set up Private Link or a service connection.
