How to bind a User assigned Managed identity to Databricks to access external resources?
04-02-2024 04:20 AM
Is there a way to bind a user-assigned managed identity to Databricks? We want to access SQL databases and Redis caches from our Spark code running on Databricks using a managed identity instead of service principals and basic authentication.
As of today, Databricks provides managed identity access for incoming traffic (i.e., connecting to Databricks from external resources) but not for outgoing traffic. The only outgoing scenario that works is accessing storage through the Unity Catalog access connector, but we are looking for resources beyond storage.
For example, other Azure resources support system-assigned and user-assigned managed identities under the "Identity" tab of the resource; the same is not available for Databricks. We are looking for a workaround or a fix for this.
Note: the Databricks-assigned managed identity present in the managed resource group (MRG) is not a scalable solution for us.
04-02-2024 05:44 PM
I just went through this issue. You can use a user-assigned managed identity, but you have to pass an access token. You also have to enable Azure AD authentication on the SQL server, add the identity as a database user, and assign it a role (a sketch of that setup follows after the code below). There is more in-depth documentation on this that you can find. The code below is what I used; I got it from another resource, it is not my own.
%pip install azure-identity

from azure.identity import DefaultAzureCredential, ManagedIdentityCredential

# Acquire an Azure AD access token for Azure SQL using the managed identity.
credential = ManagedIdentityCredential(client_id="<your clientid>")
sqlAzureAccessToken = credential.get_token("https://database.windows.net/.default").token
# Optional: inspect the token and its expiry.
print(credential.get_token("https://database.windows.net/.default"))

# JDBC connection details for the Azure SQL database.
jdbcHostname = "<servername>.database.windows.net"
jdbcDatabase = "<dbname>"
jdbcPort = 1433
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)

connectionProperties = {
    "accessToken": sqlAzureAccessToken,
    "hostNameInCertificate": "*.database.windows.net",
    "encrypt": "true",
    "trustServerCertificate": "false",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# Read a table with the token instead of a username/password.
df = spark.read.jdbc(url=jdbcUrl, table="dbo.person", properties=connectionProperties)
display(df)
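On the SQL side, the database has to be told about the identity before it will accept the token. Here is a minimal sketch of that step, assuming pyodbc and the ODBC Driver 18 for SQL Server are installed on the cluster, that you run it with the credentials of an Azure AD admin of the server, and that the server, database, and identity names are placeholders you replace:

import struct
import pyodbc
from azure.identity import DefaultAzureCredential

# Token for an Azure AD admin of the SQL server (not the managed identity itself).
admin_token = DefaultAzureCredential().get_token("https://database.windows.net/.default").token

# pyodbc expects the token as a length-prefixed UTF-16-LE byte string.
token_bytes = admin_token.encode("utf-16-le")
token_struct = struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)
SQL_COPT_SS_ACCESS_TOKEN = 1256  # connection attribute for passing an access token

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<servername>.database.windows.net;DATABASE=<dbname>",
    attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct},
)
cur = conn.cursor()
# Create a contained database user for the managed identity and grant it a role.
cur.execute("CREATE USER [<your-managed-identity-name>] FROM EXTERNAL PROVIDER;")
cur.execute("ALTER ROLE db_datareader ADD MEMBER [<your-managed-identity-name>];")
conn.commit()

The same CREATE USER / ALTER ROLE statements can also be run from any other SQL client connected as the Azure AD admin.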
04-03-2024 04:39 AM
@Carpender I have tried it with both the Databricks-assigned identity in the MRG (using the DefaultAzureCredential class) and a user-assigned managed identity (using the ManagedIdentityCredential class), and both resulted in an exception when I tried reading with the generated token: SQLServerException: Login failed for user '<token-identified principal>'.
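One way to narrow down a "Login failed for user" error is to decode the token you generated and check which principal it was actually issued for, then compare that with the user you created in the database. A minimal sketch, assuming the sqlAzureAccessToken variable from the code above is in scope:

import base64
import json

# A JWT is three base64url-encoded segments: header.payload.signature.
payload_b64 = sqlAzureAccessToken.split(".")[1]
payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
claims = json.loads(base64.urlsafe_b64decode(payload_b64))

# 'oid' is the object id of the identity, 'appid' its application (client) id,
# and 'aud' should be the Azure SQL resource for this scenario.
print(claims.get("oid"), claims.get("appid"), claims.get("aud"))

If the oid/appid in the token does not match the identity you granted access to in the database, the token is being issued for a different identity than you expect.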
04-05-2024 01:53 AM
@Carpender correcting my comment above: the Databricks-assigned managed identity is working and we are able to access the database, but as stated in the original question we are looking for authorization using a user-assigned managed identity (UAMI). With the UAMI we cannot even acquire a token, because the UAMI cannot be assigned/bound to Databricks the way it can to other Azure first-party resources. It throws the exception below:
Caused by: MsalAzureSDKException: java.util.concurrent.ExecutionException: com.azure.identity.CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. Connection to IMDS endpoint cannot be established.
Caused by: ExecutionException: com.azure.identity.CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. Connection to IMDS endpoint cannot be established.
Caused by: CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. Connection to IMDS endpoint cannot be established.
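In case it helps anyone else reproduce this, the difference between the two credential paths on a cluster comes down to whether the IMDS endpoint knows about the identity. A rough sketch of the failure mode described above (not a fix), with a placeholder client id:

from azure.identity import DefaultAzureCredential, ManagedIdentityCredential
from azure.core.exceptions import ClientAuthenticationError

scope = "https://database.windows.net/.default"

# Works: the Databricks-assigned managed identity in the MRG is attached to the
# cluster VMs, so IMDS can issue a token for it.
print(DefaultAzureCredential().get_token(scope).expires_on)

# Fails: a user-assigned managed identity that is not attached to the cluster VMs
# is unknown to IMDS, so token acquisition fails with a credential-unavailable error.
try:
    ManagedIdentityCredential(client_id="<your uami client id>").get_token(scope)
except ClientAuthenticationError as exc:
    print(type(exc).__name__, exc)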