Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Connect Databricks to a database protected by a firewall

Arnold_Souza
New Contributor III

We are facing a situation and would like to understand from the Databricks side what the best practice is.

Question: Is it possible to have a cluster with a fixed Global IP on Databricks?

Details

We have a vendor whose SQL Server database is hosted in Canada on infrastructure that is not Azure. The database is protected by a firewall that restricts which machines can connect to it. All of our company workstations resolve to the same public IP, because Zscaler handles our outbound network security. On the vendor side, it was therefore easy to add this public IP to the firewall's allow list so that the workstations could connect.

The same does not seem to happen when we use Databricks clusters. They appear to acquire a dynamic public IP, so the vendor's firewall never sees a fixed IP (or range of IPs) that it could allow.

So how can we connect to an external database from a Databricks cluster in this situation?
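To confirm that the firewall is what blocks the cluster (rather than DNS or credentials), a plain TCP probe from a notebook cell can help. The following is a minimal sketch using only the Python standard library; the host name in the usage note is a hypothetical placeholder, and port 1433 is assumed because it is the SQL Server default.

```python
import socket


def can_reach(host, port=1433, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any OSError (connection refused, timeout, DNS failure) is reported
    as False; this only tests reachability, not authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_reach("vendor-sql.example.com")` (hypothetical host) on the cluster and on a workstation should show whether only the cluster is being dropped. A firewall that silently discards packets will make the call wait for the full timeout before returning False.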

Diagram

4 REPLIES

Anonymous
Not applicable

@Arnold Souza This is a common use case among customers. You can use an Azure Firewall to create a VNet-injected workspace in which all clusters have a single outbound IP address. That single IP address can be used as an additional security layer with other Azure services and applications that allow access based on specific IP addresses. Please refer to the KB article below for more details.

https://kb.databricks.com/cloud/azure-vnet-single-ip
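Once a single outbound address is in place, it may be worth checking from a notebook that the cluster really egresses from the address the vendor has allowed. The sketch below is an assumption-laden illustration rather than anything from the KB article: it asks the public echo service api.ipify.org for the caller's address (any similar service would do) and compares it with an allow list; the addresses in the usage note are documentation-range placeholders.

```python
import ipaddress
import urllib.request


def fetch_public_ip(url="https://api.ipify.org", timeout=10):
    """Ask an external echo service which public IP this machine egresses from.

    The default URL is an assumption; it must return the address as plain text.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def is_allowed(ip, allowed_networks):
    """Return True if ip falls inside any of the CIDR ranges in allowed_networks."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr in ipaddress.ip_network(net, strict=False)
        for net in allowed_networks
    )
```

For example, `is_allowed(fetch_public_ip(), ["203.0.113.10/32"])` run on the cluster should return True only if the cluster leaves through the placeholder address 203.0.113.10. Running it after a few cluster restarts would show whether the address stays fixed.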

Our Databricks workspace is VNet-injected. Unfortunately, we can't use a NAT gateway because it raises an error during creation on Azure. The clusters managed by Databricks get "Basic" public IPs by default rather than "Standard" ones, so a NAT gateway is not supported on the container's public subnet. We do not have an Azure Firewall or any NVA in the region where the Databricks workspace is located.

We have raised a ticket with Databricks via Microsoft to get this solved, but have had no proper answer since 27 March 2023.

Otherwise, plan "B" is to recreate the workspace in a new subscription where we have Palo Alto firewalls in place, which have a fixed outgoing IP.

Anonymous
Not applicable

Hi @Arnold Souza

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future.

Thank you for your participation and let us know if you need any further assistance! 

Anonymous
Not applicable

@Arnold Souza If you file a ticket with Azure support, they can help customize the VNet by unlocking it, since the Azure Databricks resources are deployed in a managed resource group. Your plan B should also be the way to go if option 1 does not work as expected. Once you deploy a new workspace, you can migrate the existing artifacts as described at the link below.

https://github.com/databrickslabs/migrate
