08-22-2024 03:58 AM
Hi @Filip,
That's an obsolete way of configuring access to a storage account. Nowadays you should use Unity Catalog (UC) storage credentials and external locations to configure access to storage accounts.
A storage credential is a securable object representing an Azure managed identity or a Microsoft Entra ID service principal. Once a storage credential is created, access to it can be granted to principals (users and groups). Storage credentials are primarily used to create external locations, which scope access to a specific storage path.
Storage credentials - Azure Databricks - Databricks SQL | Microsoft Learn
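For illustration, a minimal sketch of that setup, assuming a storage credential named my_umi_credential has already been registered in UC (the location, container, account, and group names below are placeholders):

# Create an external location that scopes the credential to one storage path.
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS my_lake_location
    URL 'abfss://my-lake-container@mystorageaccount.dfs.core.windows.net/'
    WITH (STORAGE CREDENTIAL my_umi_credential)
""")

# Grant access on the location to a group; UC then enforces it per path.
spark.sql("GRANT READ FILES, WRITE FILES ON EXTERNAL LOCATION my_lake_location TO `data-engineers`")

# Members of the group can now read the path directly, no spark.conf setup needed.
df = spark.read.csv("abfss://my-lake-container@mystorageaccount.dfs.core.windows.net/hello.csv")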
08-22-2024 05:17 AM
Yeah, I'm aware that UC fixes that, but I'm not on UC yet. I wanted to know whether it is even possible to use our own user-assigned managed identity and assign it instead of the built-in one, as it looks like that is not really possible for some reason.
08-22-2024 05:26 AM
Ok, so unfortunately using a user-assigned managed identity to read/write from ADLS Gen2 inside a notebook is not directly supported. Your best bet is to use a regular service principal or switch to Unity Catalog.
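For reference, the service principal pattern is the standard per-account OAuth setup sketched below (account, tenant, scope, and key names are placeholders; keep the secret in a secret scope, never in the notebook):

# OAuth client-credentials configuration for ABFS, scoped to one storage account.
storage_account = "mystorageaccount"
tenant_id = "my-entra-tenant-id"
client_id = "my-sp-application-id"
client_secret = dbutils.secrets.get(scope="my-scope", key="my-sp-secret")

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")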
12-04-2024 01:00 AM
Hi,
It can be accessed with the following code (the values below are placeholders for your own environment):

storageAccountName = "my-storage-account-name"
applicationClientId = "my-umi-client-id"   # client ID of the user-assigned managed identity
aadDirectoryId = "my-entra-tenant-id"
containerName = "my-lake-container"

# Authenticate via OAuth with the MSI token provider, so the token is
# fetched from the VM's managed identity instead of a client secret.
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider")
spark.conf.set("fs.azure.account.oauth2.msi.tenant", aadDirectoryId)
spark.conf.set("fs.azure.account.oauth2.client.id", applicationClientId)

df = spark.read.csv(f"abfss://{containerName}@{storageAccountName}.dfs.core.windows.net/hello.csv")
df.show()

I too would like to change to UC but can't take the time to do so...
05-14-2025 11:42 AM
Hi, could you tell me what cluster settings you used for this one?
Monday
The best option is to use "External Locations" and "Storage Credentials" under "Unity Catalog". This avoids tons of problems.
If Unity Catalog is not possible, the only way to achieve this that I was able to verify is:
When you deploy a Databricks workspace, a user-assigned dbManagedIdentity is created in the background. I'm not sure whether this only happens in non-VNet-injected workspaces or in all setups (still pending a check).
That dbManagedIdentity is then associated with the VMs making up the cluster. So, if you assign the proper roles to this managed identity over ADLS, it turns out that DefaultAzureCredential picks up this identity (I checked the access token payload and the code myself), and the code pasted above really works. The VMs use this managed identity to access ADLS 🙂 but (I think) it has nothing to do with the Azure Databricks Access Connector. A quick way to confirm which identity is actually used is sketched below.
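A minimal sketch of that token check, assuming the azure-identity package is installed on the cluster (the exact claims present, such as oid or xms_mirid, vary by token):

import base64, json
from azure.identity import DefaultAzureCredential

# Request a storage-scoped token, then decode the JWT payload to see
# which identity the token was issued for.
token = DefaultAzureCredential().get_token("https://storage.azure.com/.default").token
payload = token.split(".")[1]
payload += "=" * (-len(payload) % 4)   # restore stripped base64 padding
print(json.dumps(json.loads(base64.urlsafe_b64decode(payload)), indent=2))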
Check here how a user-assigned managed identity is assigned to VMs.
Kind Regards.
Monday
Besides, this only works on dedicated clusters; it does not work on shared ones. Why? No idea at all. In the latter case, the IMDS (Instance Metadata Service) endpoint that Azure injects into resources as the single secure, valid channel for getting managed-identity tokens is either not created or not accessible, so the token request fails with an error. You can reproduce the failure with the probe below.
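A quick probe of the IMDS token endpoint, assuming the requests package is available (169.254.169.254 is the fixed Azure IMDS address):

import requests

# Ask IMDS for a managed-identity token scoped to Azure Storage.
# On a dedicated cluster this returns 200 with a JSON token body; on a
# shared cluster the endpoint is unreachable and the call errors out.
try:
    resp = requests.get(
        "http://169.254.169.254/metadata/identity/oauth2/token",
        params={"api-version": "2018-02-01", "resource": "https://storage.azure.com/"},
        headers={"Metadata": "true"},
        timeout=5,
    )
    print(resp.status_code, resp.text[:300])
except requests.exceptions.RequestException as e:
    print("IMDS not reachable:", e)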