Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Connecting Azure databricks with firewall enabled Azure storage account

trailblazer
New Contributor II

Hi, I am trying to connect securely from an Azure Databricks workspace to an Azure Data Lake Storage Gen2 account. The storage account is set up with these options:

1. Enabled from selected virtual networks and IP addresses; we whitelisted a few IPs.

2. Added Microsoft.Databricks/AccessConnector and selected the access connector related to this storage account.

3. Allow Azure services on the trusted services list to access this storage account.

With the above settings I am not able to read data from the storage account. If the storage account firewall is opened to all networks then it works, but it does not with the settings above.

How do I go about restricting access to only the necessary Azure Databricks services? Are there any service tags I need to whitelist?

Thanks for your help

1 ACCEPTED SOLUTION

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

Hi @trailblazer ,

Yep, in my setup it works as expected. But our environments could be different. I have a VNet-injected workspace with SCC (secure cluster connectivity) enabled and private endpoints configured for the storage account.
One question: do you use classic compute or serverless? If you use serverless, then you need to configure things a bit differently to make it work:

Configure a firewall for serverless compute access - Azure Databricks | Microsoft Learn
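A quick way to confirm the private endpoint part of this setup is to check, from a notebook on classic compute, whether the storage account's DFS endpoint resolves to a private address. This is only a sketch; the account name is a placeholder:

```python
import ipaddress
import socket

def is_private(ip: str) -> bool:
    """True if the address is an RFC 1918 / private address."""
    return ipaddress.ip_address(ip).is_private

def resolves_to_private_ip(hostname: str) -> bool:
    """True if DNS resolves the hostname to a private address,
    which is what you should see when a private endpoint is in effect."""
    return is_private(socket.gethostbyname(hostname))

# From a notebook on the workspace's classic compute (hypothetical account name):
# resolves_to_private_ip("mystorageaccount.dfs.core.windows.net")
```

If this returns False (the hostname resolves to a public IP), traffic is not going over the private endpoint and the storage firewall will evaluate it as public traffic.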

 

View solution in original post

5 REPLIES

mnorland
Valued Contributor

Kyle Hale has an excellent blog post on using the connector:
https://medium.com/@kyle.hale/connecting-to-azure-resources-with-managed-identities-in-databricks-47...

If you have everything Kyle covers in place, check the VNet for the ADLS Gen2 account to make sure it has connectivity to the VNet your workspace uses.
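For reference, direct managed-identity access (outside Unity Catalog) boils down to a handful of Hadoop/ABFS settings. A minimal sketch, assuming the hadoop-azure MsiTokenProvider and a placeholder client ID for the connector's identity:

```python
def msi_abfss_conf(account: str, client_id: str) -> dict:
    """Build the ABFS OAuth settings for managed-identity auth.
    `client_id` is the managed identity's client ID (placeholder here)."""
    suffix = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
    }

# In a notebook you would apply these before reading, e.g.:
# for k, v in msi_abfss_conf("mystorageaccount", "<client-id>").items():
#     spark.conf.set(k, v)
```

Note that none of this helps if the firewall itself rejects the request; it only rules out the authentication side.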

szymon_dybczak
Esteemed Contributor III

Hi @trailblazer ,

So, did you configure it something like this? Did you add your access connector to the resource instances?

szymon_dybczak_0-1753294922901.png

 

trailblazer
New Contributor II

Thanks Szymon,

Yes, I have the exact setup as above; the access connectors are added to the allowed resource instances list, and they have the Contributor role on the storage account.

With the above setup, are you able to read/write to the storage account? For me it is not working unless I select "Enabled from all networks".

Not sure why I get this error message: "Please check your Azure Firewall - Full error message: Your request failed with status FAILED: [BAD_REQUEST] This Azure storage request is not authorized. The storage account's 'Firewalls and virtual networks' settings may be blocking access to storage services. Please verify your Azure storage credentials or firewall exception settings."


mkkao924
New Contributor

I am having the exact same issue as @trailblazer: if I enable traffic for all networks, I can read/write to the storage account; if I only allow selected networks, including the VNet, it doesn't work. I am using a serverless setup. I also followed the firewall configuration article mentioned above.

Do I need a private endpoint setup? If I recall from my reading, with VNet injection a private endpoint is not required? I currently only have public and private subnets, but I did not set up any private endpoints.
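One thing worth remembering for the serverless case: serverless egress does not originate from your workspace VNet, so a VNet rule alone will never match. A quick sanity check is whether a given serverless egress IP actually falls inside any of the firewall's allowed ranges. A sketch with made-up CIDRs and IPs:

```python
import ipaddress

def ip_allowed(ip: str, allowed_cidrs: list[str]) -> bool:
    """True if the egress IP falls inside any firewall allow-list range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)

# Hypothetical values -- substitute your firewall rules and the egress IP
# observed by the storage account's diagnostic logs:
# ip_allowed("52.146.50.16", ["52.146.50.0/24", "10.0.0.0/16"])
```

If the observed egress IP is not covered, the request is blocked regardless of the VNet or resource-instance rules.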