How to bind a User assigned Managed identity to Databricks to access external resources?

sushant047_ms
New Contributor II

Is there a way to bind a user-assigned managed identity to Databricks? We want to access SQL databases and Redis Cache from our Spark code running on Databricks using a managed identity instead of service principals and basic authentication.

As of today, Databricks provides managed identity access for incoming traffic (i.e., connecting to Databricks from external resources) but not for outgoing traffic. The only outgoing scenario that works is accessing Unity Catalog storage through the access connector, but we are looking at resources beyond storage.

For example, other Azure resources support system-assigned and user-assigned managed identities under the resource's "Identity" tab; the same is not available for Databricks. We are looking for a workaround or fix for this.

Note: the Databricks-assigned managed identity present in the managed resource group (MRG) is not a scalable solution for us.

3 REPLIES

Carpender
New Contributor II

I just went through this issue. You can use a user-assigned managed identity, but you have to pass an access token. You also have to enable the identity on the SQL server, add it as a database user, and assign it a role; there is more in-depth documentation on this that you can find (a sketch of that SQL-side setup follows the code below). Then the code below is used. I got it from another resource; it is not my own code.

%pip install azure-identity

from azure.identity import DefaultAzureCredential, ManagedIdentityCredential

# Acquire an Azure AD access token for Azure SQL using the user-assigned
# managed identity (note the parameter is client_id, not clientId).
# DefaultAzureCredential() can be used instead for the Databricks-assigned identity.
credential = ManagedIdentityCredential(client_id="<your clientid>")
sqlAzureAccessToken = credential.get_token("https://database.windows.net/.default").token
print(credential.get_token("https://database.windows.net/.default"))  # optional: inspect the token object

# JDBC connection details for the Azure SQL database.
jdbcHostname = "<servername>.database.windows.net"
jdbcDatabase = "<dbname>"
jdbcPort = 1433
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
connectionProperties = {
    "accessToken": sqlAzureAccessToken,              # authenticate with the AAD token
    "hostNameInCertificate": "*.database.windows.net",
    "encrypt": "true",
    "trustServerCertificate": "false",               # validate the server certificate
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# Read a table through the Spark JDBC data source using the token.
df = spark.read.jdbc(url=jdbcUrl, table="dbo.person", properties=connectionProperties)
display(df)
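
For the SQL-side setup mentioned above (creating the identity as a database user and giving it a role), something like the following works. This is only a minimal sketch: it assumes pyodbc and the Microsoft ODBC Driver 18 are available where it runs, that the identity running it is an Azure AD admin of the database, and the server, database, and identity names are placeholders.

%pip install pyodbc azure-identity

import struct
import pyodbc
from azure.identity import DefaultAzureCredential

# Token for whoever runs the setup (assumption: this identity is an
# Azure AD admin on the SQL server, otherwise CREATE USER will fail).
admin_token = DefaultAzureCredential().get_token("https://database.windows.net/.default").token

# Pack the token the way the ODBC driver expects it (UTF-16-LE bytes
# prefixed with their length).
token_bytes = admin_token.encode("utf-16-le")
token_struct = struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)
SQL_COPT_SS_ACCESS_TOKEN = 1256  # pre-connect attribute for AAD access tokens

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<servername>.database.windows.net,1433;"
    "Database=<dbname>;Encrypt=yes;",
    attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct},
)
conn.autocommit = True

cursor = conn.cursor()
# "<identity-name>" is the display name of the user-assigned managed identity.
cursor.execute("CREATE USER [<identity-name>] FROM EXTERNAL PROVIDER;")
cursor.execute("ALTER ROLE db_datareader ADD MEMBER [<identity-name>];")

This only needs to run once per database; after that the JDBC read above should succeed for that identity.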

sushant047_ms
New Contributor II

@Carpender I have tried it with both the Databricks-assigned identity in the MRG (using the DefaultAzureCredential class) and a user-assigned managed identity (using the ManagedIdentityCredential class), and both resulted in an exception when I tried reading with the generated token (SQLServerException: Login failed for user '<token-identified principal>').
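
When the login fails like that, it helps to check which principal the token was actually issued to. A minimal sketch that decodes the (unverified) token claims, assuming the sqlAzureAccessToken variable from the snippet above:

import base64
import json

def jwt_claims(token: str) -> dict:
    """Return the (unverified) payload claims of a JWT access token."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

claims = jwt_claims(sqlAzureAccessToken)
# "oid" is the object id and "appid" the client id of the principal the
# token was issued to; compare them with the identity added to the database.
print(claims.get("oid"), claims.get("appid"), claims.get("aud"))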

sushant047_ms
New Contributor II

@Carpender Correcting my comment above: the Databricks-assigned managed identity is working and we are able to access the database, but as stated in the original question we are looking for authorization using a user-assigned managed identity (UAMI). With a UAMI we cannot even create a token, since the UAMI cannot be assigned/bound to Databricks the way it can to other first-party Azure resources. It throws the exception below:

Caused by: MsalAzureSDKException: java.util.concurrent.ExecutionException: com.azure.identity.CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. Connection to IMDS endpoint cannot be established.
Caused by: ExecutionException: com.azure.identity.CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. Connection to IMDS endpoint cannot be established.
Caused by: CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. Connection to IMDS endpoint cannot be established.