
Databricks Azure Cross-Tenant connection to storage account

Behwar
New Contributor II

I'm currently facing a challenge with establishing a cross-tenant connection between Azure Databricks in Tenant A and a Storage Account in Tenant B. Below is the detailed setup of both tenants:

| Tenant A | Tenant B |
| -------- | -------- |
| Azure Databricks, public network access disabled | Storage Account (Hierarchical Namespace, DFS), public network access disabled |
| 2 Private Endpoints to Databricks - API and Auth | |
| Private Endpoint to Storage Account on Tenant B | Accepted PE Connection |
| Storage Credential with Service Principal from Tenant B | Service Principal with Reader/Explorer Role on SA |
| Test WinVM to Access Databricks | |
| Private DNS Zones for blob & dfs & databricks peered to one VNET | |

I tested the connection to the Storage Account in Tenant B with a `curl -X GET` command from a Linux VM in the same network and subnet where all Tenant A resources are deployed. From this VM the connection works. From outside this VM it fails, as expected.
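For anyone who wants to reproduce this kind of check without `curl`, here is a minimal Python sketch of the two requests involved. It assumes the Service Principal authenticates with the OAuth client-credentials flow and that the filesystem is listed through the ADLS Gen2 REST API; the exact `curl` command used above is not shown in this post, so the `x-ms-version` value and the function names are illustrative assumptions.

```python
import urllib.parse
import urllib.request


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) the client-credentials token request for Azure Storage."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://storage.azure.com/.default",
    }).encode()
    return urllib.request.Request(url, data=body, method="POST")


def build_list_request(account, filesystem, token):
    """Build (but do not send) a request that lists the root of an ADLS Gen2 filesystem."""
    query = urllib.parse.urlencode({"resource": "filesystem", "recursive": "false"})
    url = f"https://{account}.dfs.core.windows.net/{filesystem}?{query}"
    headers = {
        "Authorization": f"Bearer {token}",
        "x-ms-version": "2021-08-06",  # assumed API version, adjust as needed
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# To actually send a request from the VM (network access required):
#   with urllib.request.urlopen(build_list_request(account, fs, token)) as resp:
#       print(resp.status)
```

Running this from the Linux VM and from outside the VNET should mirror the behaviour described above, provided the assumptions hold.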

The issue arises when trying to establish an external location (Unity Catalog) within Databricks (Tenant A). The process fails with the following error:

`Failed to access cloud storage: [AbfsRestOperationException] () exceptionTraceId=9ecf1f01-3b05-42e8-b3e9-1a04edaa8db9`

I should also mention that public access to the Databricks workspace is disabled, so I'm reaching it from the Windows VM through the Databricks private endpoint. The VM's address is in the same VNET (`10.0.0.0/24`) as the other resources.

`nslookup` from the Windows VM resolves the workspace as follows:
```
nslookup adb-123.azuredatabricks.net
Server: UnKnown
Address: 168.63.129.16

Non-authoritative answer:
Name: adb-123.privatelink.azuredatabricks.net
Address: 10.0.0.6
Aliases: adb-123.azuredatabricks.net
```

`nslookup` for the Storage Account in Tenant B, run from the Windows VM in Tenant A:
```
nslookup teststorageaccount.dfs.core.windows.net
Server: UnKnown
Address: 168.63.129.16

Non-authoritative answer:
Name: teststorageaccount.privatelink.dfs.core.windows.net
Address: 10.0.0.4
Aliases: teststorageaccount.dfs.core.windows.net
```
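The same DNS check can be scripted. This is a small sketch, not part of my actual setup: it classifies a resolved address against the VNET range `10.0.0.0/24` mentioned above. The function names are made up for illustration.

```python
import ipaddress
import socket

# VNET address space from this post.
VNET = ipaddress.ip_network("10.0.0.0/24")


def classify_address(ip_text, vnet=VNET):
    """Say whether an address is inside the VNET, private but elsewhere, or public."""
    ip = ipaddress.ip_address(ip_text)
    if ip in vnet:
        return "inside VNET"
    if ip.is_private:
        return "private, outside VNET"
    return "public"


def resolve(host, port=443):
    """Return the sorted set of IPv4 addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


# Example (needs DNS access, so it is left commented out):
#   for ip in resolve("teststorageaccount.dfs.core.windows.net"):
#       print(ip, classify_address(ip))
```

If the private DNS zones are linked correctly, an address such as `10.0.0.4` is reported as inside the VNET, while a public address points to resolution bypassing the private endpoint.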

Given these results, the configuration in both Tenant A and Tenant B seems to be correct. However, there's an issue with the connection from Databricks to the Storage Account.

**When I enable public network access (open to all networks) on the Storage Account, the Databricks external location can connect to it**, but that is not an acceptable solution.

I also whitelisted the Databricks control plane addresses on the Storage Account, but that didn't work either.

| Location | Description | IP Addresses |
| -------- | ----------- | ------------ |
| West Europe | Control Plane IPs | 52.232.19.246/32, 40.74.30.80/32, 20.103.219.240/28, 4.150.168.160/28 |

I also tested with the outbound addresses, but that didn't help. A ping to my Databricks workspace from outside the VNET went to `40.74.30.80`.


The whole configuration was deployed via Terraform, but even when I set `skip_validation = true`, the Databricks UI still shows no connection to the cross-tenant Storage Account.

```hcl
resource "databricks_storage_credential" "crosstenant" {
  name = "crosstenant"
  azure_service_principal {
    application_id = "<snip>"
    client_secret  = "<snip>"
    directory_id   = "<snip>"
  }
}

resource "databricks_external_location" "crosstenant" {
  credential_name = databricks_storage_credential.crosstenant.id
  name            = "ext-test-crosstenant"
  url             = "abfss://test@teststorageaccount.dfs.core.windows.net"
  # skip_validation = true
}
```

The issue does not appear to be related to Terraform, as it also persists when the resources are configured manually.

I have tested the setup with and without a Network Security Group (NSG), where NSG rules were configured to allow all traffic (Any Any Allow).

## Storage Account Logging:
I enabled logging on the Storage Account and analyzed the `StorageBlobLogs` for insights. Here’s what I observed:

When I clicked "Test Connection" in Databricks (with the external location deployed using `skip_validation`), the logs showed this URI: `https://test-storageaccount.dfs.core.windows.net/test/?upn=false&action=getAccessControl&timeout=90`.

These requests (`callerIP`) originate from IP addresses within the `10.120.x.x` private network range, which is outside the VNETs of both Tenant A and Tenant B, e.g.:
```
10.120.254.135; 10.120.252.170; 10.120.252.215; 10.120.252.173; 10.120.254.207; 10.120.252.131; 10.120.254.163
```

When public networking on the Storage Account is enabled, the same private IP addresses (10.120.x.x) appear in the logs. However, private IP ranges such as `10.120.252.0/24` cannot be added to the Storage Account firewall allowlist.
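To make the mismatch concrete, here is a short sketch that checks the `callerIP` values from the logs against the VNET range and the control plane CIDRs I whitelisted. All addresses are copied from this post; the helper name `classify` is just for illustration.

```python
import ipaddress

# RFC 1918 private ranges, which Storage Account IP rules do not accept.
RFC1918 = [ipaddress.ip_network(c) for c in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

# VNET range and whitelisted control plane CIDRs from this post.
VNET = ipaddress.ip_network("10.0.0.0/24")
ALLOWLIST = [ipaddress.ip_network(c) for c in (
    "52.232.19.246/32", "40.74.30.80/32", "20.103.219.240/28", "4.150.168.160/28",
)]

# callerIP values seen in StorageBlobLogs.
CALLER_IPS = [
    "10.120.254.135", "10.120.252.170", "10.120.252.215", "10.120.252.173",
    "10.120.254.207", "10.120.252.131", "10.120.254.163",
]


def classify(ip_text):
    """Report which of the known ranges contain the given address."""
    ip = ipaddress.ip_address(ip_text)
    return {
        "ip": ip_text,
        "rfc1918": any(ip in net for net in RFC1918),
        "in_vnet": ip in VNET,
        "in_allowlist": any(ip in net for net in ALLOWLIST),
    }


if __name__ == "__main__":
    for caller in CALLER_IPS:
        print(classify(caller))
```

Every caller address is private, outside the `10.0.0.0/24` VNET, and not covered by any whitelisted CIDR, which matches what the logs show: the requests do not arrive through the private endpoint in my VNET, and the IP allowlist cannot match them.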

I also tried the recommended approach, the Access Connector for Azure Databricks (a managed identity), but there is no option to add a Storage Credential using a managed identity in a cross-tenant scenario. I therefore switched to a Service Principal, which works, but only when public networking is enabled on the Storage Account.

So the question is: how can I establish a cross-tenant configuration through a private endpoint while keeping public networking disabled on the destination Storage Account (access through the private endpoint only)?

Any insights or suggestions on what might be going wrong or potential steps to troubleshoot further would be greatly appreciated.

Thank you in advance for your help!
