Databricks

FerArribas · ‎01-02-2023

Hi,

I am thoroughly investigating the best security practices for accessing the Databricks web UI. I am unsure about the difference between protecting the web UI with (1) an IP access list (https://learn.microsoft.com/en-us/azure/databricks/security/network/ip-access-list) or (2) disabling public access (access via Private Link on the front end).

The doubts arise because with both options the web UI is exposed to the Internet (public IP), and it seems the same "firewall" applies in both cases. We want to know whether the only difference is that, with the second option, the only IP allowed to access the web UI is the private IP of the front-end private endpoint (similar to an IP access list with a single IP configured and no ability to change it).

In short, would it be the same to configure only the IP of the private endpoint in the IP access list vs disable public access?

NOTE: For the back-end connection, we have no doubts about the security benefits for the connection between the data plane and the control plane.

Thanks,

Rik · ‎04-06-2023

"In short, would it be the same to configure only the IP of the private endpoint in the IP access list vs disable public access?"

The access list doesn't apply to private IPs, only to public IPs (internet traffic). Relevant part from the docs:

"If you use PrivateLink, note that IP access lists apply only to requests over the internet (public IP addresses). Private IP addresses from PrivateLink traffic cannot be blocked by IP access lists. To block specific private IP addresses from PrivateLink traffic, use AWS Network Firewall."

A rule of thumb: public IPs can only connect to public endpoints, private IPs can only connect to private endpoints.
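
To make this concrete, here is a minimal Python sketch using only the standard library's ipaddress module. The function name and the sample addresses are made up for illustration. It only shows which candidate entries fall in private ranges and so would never be matched by an IP access list, since those lists only see public source addresses.

```python
import ipaddress

def split_access_list_candidates(entries):
    """Split IPs/CIDRs into (public, private) lists.

    Hypothetical helper: entries that end up in `private` would have no
    effect in an IP access list, which only applies to public addresses.
    """
    public, private = [], []
    for entry in entries:
        # strict=False accepts single hosts as well as CIDR ranges
        network = ipaddress.ip_network(entry, strict=False)
        (private if network.is_private else public).append(entry)
    return public, private

# Sample values only; replace with the sources you actually need to allow.
public, private = split_access_list_candidates(
    ["10.1.2.4", "8.8.8.8", "192.168.0.0/24", "1.1.1.0/24"]
)
print("usable in an IP access list:", public)
print("private, handle via Private Link / network rules:", private)
```

So adding the private endpoint's IP (for example a 10.x address) to the access list would not do anything useful.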

The most "secure" way is only accessing the workspace through Private Link (your private endpoint), but keep in mind this is only as secure as your private network. You should identify all sources that need access to your workspace (end users, devops agents, SCIM services, other services) and try to inject them into your private network as much as possible.

There are cases when you might still need to expose your public endpoint, because some services/traffic only run from the internet (for instance AAD SCIM provisioning or (public) devops build agents). For such cases, you still need to apply the access list to restrict access as much as possible (but keep in mind that you often don't control these IPs, so they may change from time to time).
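
As a rough sketch of what such an allow list could look like: as far as I know, the IP access list REST API takes a JSON body with `label`, `list_type` and `ip_addresses` on POST /api/2.0/ip-access-lists, and the feature has to be enabled on the workspace first. Please verify this against the current docs before relying on it. The label and CIDR below are placeholders, and the snippet only builds the body, it does not call the API.

```python
import json

def build_allow_list_payload(label, cidrs):
    """Build the JSON body for an ALLOW list.

    Field names are my understanding of the IP access list API;
    `label` and `cidrs` are sample values, not real agent ranges.
    """
    return {"label": label, "list_type": "ALLOW", "ip_addresses": list(cidrs)}

payload = build_allow_list_payload("devops-build-agents", ["1.1.1.0/24"])
print(json.dumps(payload, indent=2))
```

Because the provider's IP ranges can change, it probably makes sense to regenerate this list on a schedule rather than maintain it by hand.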


Yelf · ‎02-08-2023

Hey,

I have been investigating security practices for Azure Databricks as well. It seems that even with public access disabled and the workspace set up with a private endpoint and Private Link, the workspace URL still resolves to a public IP (francecentral.azuredatabricks.net), with the private link sitting behind a front-end link.

I was wondering if you had any further insights on this?

Rik · ‎08-07-2023

@Yelf, this is how Private Link works. Once you enable Private Link, your per-workspace URL will resolve to privatelink.workspace-url.azuredatabricks.net.

If you are on your private network, you should have a private DNS zone set up that resolves this domain name to a private IP. Otherwise (if you use public DNS), it will resolve to the region-specific URL (which, in turn, will resolve to a public IP).

So:

- There will always be a CNAME (and public IP) resolution for your per-workspace URL. If you disable public access, it will just block all traffic on this (public) endpoint.
- If you don't get resolved to a private IP on your private network, make sure to:
  a) check that the per-workspace URL resolves to the privatelink URL (if not, the private link is misconfigured or not approved)
  b) check your private DNS zone for an A record mapping this privatelink hostname to your private IP
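
The two checks above can be sketched in Python with the standard library. This is only an illustration: `diagnose` and `check_workspace` are names I made up, and how much of the CNAME chain `socket.gethostbyname_ex` exposes depends on the platform's resolver, so nslookup or dig will usually give a fuller picture.

```python
import ipaddress
import socket

def diagnose(canonical_name, aliases, addresses):
    """Classify a DNS lookup result for a per-workspace URL.

    Pure function, so it can be tried on sample data without a network.
    """
    names = [canonical_name, *aliases]
    via_privatelink = any("privatelink" in name for name in names)
    all_private = bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )
    if all_private:
        return "ok: resolves to a private IP"
    if via_privatelink:
        # Check b): the CNAME is there, but the A record is missing or unused
        return "privatelink CNAME found but public IP: check the private DNS zone A record"
    # Check a): no privatelink name anywhere in the result
    return "no privatelink CNAME: private link may be misconfigured or not approved"

def check_workspace(hostname):
    """Resolve hostname with the local resolver and diagnose the result."""
    canonical, aliases, addresses = socket.gethostbyname_ex(hostname)
    return diagnose(canonical, aliases, addresses)
```

Run `check_workspace(...)` with your own per-workspace hostname from a machine inside the private network. From a machine using public DNS you should expect the public IP result, which is normal.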

Azure Databricks - Difference between protecting the WEB UI with IP Access list or disabling public access?