How to authenticate databricks provider in terraform using a system-managed identity?
02-12-2024 12:56 AM
- Grant the (system-assigned) managed identity the "Contributor" role on Subscription B. I can confirm via Azure portal that the app service behind the managed identity indeed has the "Contributor" role on the subscription in which the databricks workspace resides.
- Register the managed identity as a databricks service principal in the databricks workspace using its application id.
- Initialize the databricks provider with the following arguments:
- host: host address of the databricks workspace
- azure_workspace_resource_id: resource ID of azure workspace, obtained from an "azurerm_databricks_workspace" data object
- azure_client_id: application id of system-managed identity / registered databricks service principal.
- azure_use_msi: true
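For reference, the provider initialization described in the steps above could be sketched roughly as follows. This is only an illustration of the configuration as described, not a confirmed working setup. The variable names (`databricks_workspace_name`, `resource_group_name`, `managed_identity_client_id`) are hypothetical placeholders.

```hcl
# Hypothetical input variables, named here only for illustration.
variable "databricks_workspace_name" { type = string }
variable "resource_group_name" { type = string }
variable "managed_identity_client_id" { type = string }

# Look up the existing workspace in Subscription B.
data "azurerm_databricks_workspace" "this" {
  name                = var.databricks_workspace_name
  resource_group_name = var.resource_group_name
}

# Provider configured with the four arguments listed above.
provider "databricks" {
  host                        = data.azurerm_databricks_workspace.this.workspace_url
  azure_workspace_resource_id = data.azurerm_databricks_workspace.this.id
  azure_client_id             = var.managed_identity_client_id
  azure_use_msi               = true
}
```

Note that managed identity authentication obtains its token from the Azure host that Terraform runs on, so this configuration can only work when `terraform apply` is executed on the resource (here, the app service) that carries the identity.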
02-12-2024 02:55 AM
Dear @Retired_mod,
thanks a lot for your response describing the step-by-step guide to authenticate Databricks using a managed identity.
However, to the best of my understanding, this is not what I want to achieve. To recap, my goal is to use the system-assigned (i.e., not a user-assigned) managed identity of a web app to authenticate the Terraform Databricks provider (i.e., not the CLI). I would be very grateful if you could provide a similar step-by-step guide for this setup.
02-13-2024 11:42 AM
I also tried to authenticate using a user-assigned managed identity. In detail, I performed the following steps using Terraform:
- Create a user-assigned managed identity in the same resource group as the databricks workspace
- Create a databricks service principal setting 'application_id' to the client id of the managed identity.
- Assign the managed identity the "Contributor" role on the subscription in which the databricks workspace is located.
- Declare a databricks provider setting 'azure_use_msi' to true, 'host' to the databricks workspace url, 'azure_workspace_resource_id' to the resource id of the databricks workspace, and 'azure_client_id' to the application id of the managed identity.
- Create a databricks token using said provider
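A rough Terraform sketch of the five steps above, as I read them. It reproduces the configuration being described (the one that fails), not a fix. All resource names, variable names, and the identity name are hypothetical, and the default `databricks` provider used for step 2 is an assumption, since the post does not say how the service principal was created.

```hcl
terraform {
  required_providers {
    azurerm    = { source = "hashicorp/azurerm" }
    databricks = { source = "databricks/databricks" }
  }
}

provider "azurerm" {
  features {}
}

# Hypothetical inputs.
variable "resource_group_name" { type = string }
variable "databricks_workspace_name" { type = string }

data "azurerm_subscription" "current" {}

data "azurerm_resource_group" "this" {
  name = var.resource_group_name
}

data "azurerm_databricks_workspace" "this" {
  name                = var.databricks_workspace_name
  resource_group_name = var.resource_group_name
}

# Step 1: user-assigned managed identity in the workspace's resource group.
resource "azurerm_user_assigned_identity" "tf" {
  name                = "databricks-terraform-identity" # hypothetical name
  resource_group_name = data.azurerm_resource_group.this.name
  location            = data.azurerm_resource_group.this.location
}

# Assumption: a default provider that already has working credentials
# (for example Azure CLI login) is used to register the service principal.
provider "databricks" {
  host = data.azurerm_databricks_workspace.this.workspace_url
}

# Step 2: Databricks service principal with the identity's client ID.
resource "databricks_service_principal" "tf" {
  application_id = azurerm_user_assigned_identity.tf.client_id
  display_name   = "databricks-terraform-identity"
}

# Step 3: "Contributor" on the subscription that contains the workspace.
resource "azurerm_role_assignment" "contributor" {
  scope                = data.azurerm_subscription.current.id
  role_definition_name = "Contributor"
  principal_id         = azurerm_user_assigned_identity.tf.principal_id
}

# Step 4: provider that authenticates with the managed identity.
provider "databricks" {
  alias                       = "msi"
  host                        = data.azurerm_databricks_workspace.this.workspace_url
  azure_workspace_resource_id = data.azurerm_databricks_workspace.this.id
  azure_client_id             = azurerm_user_assigned_identity.tf.client_id
  azure_use_msi               = true
}

# Step 5: token created through the MSI-authenticated provider.
resource "databricks_token" "msi" {
  provider = databricks.msi
  comment  = "Token created via managed identity"
}
```

One thing worth checking with a setup like this: a managed identity token can only be requested from an Azure resource that has the identity attached, so the machine running `terraform apply` needs the user-assigned identity assigned to it.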
The same error ("Identity not found") occurs during the terraform apply of step 5 (token creation). I also tried creating other resources; they all fail with the error message stated above. @alexott, do you have a suggestion?
Thanks a lot for your support!
03-22-2024 03:17 AM
I think I have your answer.
To create a databricks provider to manage your workspace using an SPN, you need to create the provider like this:
provider "databricks" {
  alias               = "workspace"
  host                = <your workspace URL>
  azure_client_id     = <Application ID of the SPN>
  azure_client_secret = <Application secret of the SPN>
  azure_tenant_id     = <Your Azure subscription tenant ID>
}
I store all these credentials as secrets in my Azure Key Vault. I then define data sources that read the secret values from the Key Vault and pass them into the databricks provider definition. As you probably know, you need the azurerm provider for this. Below is the full block:
data "azurerm_key_vault" "key_vault" {
  name                = <your keyvault_name>
  resource_group_name = <your rg_name>
}

data "azurerm_key_vault_secret" "workspace_url" {
  name         = "<Workspace-URL>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

data "azurerm_key_vault_secret" "workspace_admin_spn_app_id" {
  name         = "<Workspace-ADMINSPN-APPLICATIONID>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

data "azurerm_key_vault_secret" "workspace_admin_spn_app_secret" {
  name         = "<Workspace-ADMINSPN-APPLICATIONSECRET>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

data "azurerm_key_vault_secret" "tenant_id" {
  name         = "<AZURE-TENANTID>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

provider "databricks" {
  alias               = "workspace"
  host                = data.azurerm_key_vault_secret.workspace_url.value
  azure_client_id     = data.azurerm_key_vault_secret.workspace_admin_spn_app_id.value
  azure_client_secret = data.azurerm_key_vault_secret.workspace_admin_spn_app_secret.value
  azure_tenant_id     = data.azurerm_key_vault_secret.tenant_id.value
}
2 weeks ago - last edited 2 weeks ago
This answer is for authenticating with a service principal, not a managed identity ("secret-less").
I'm also running into the same error and have tried several permutations of the configuration, including using a databricks_service_principal_password as a secret. Each attempt results in a different error.
data "azurerm_databricks_workspace" "this" {
  name                = var.databricks_workspace_name
  resource_group_name = var.resource_group_name
}

provider "databricks" {
  alias                       = "spn"
  host                        = data.azurerm_databricks_workspace.this.workspace_url
  azure_workspace_resource_id = data.azurerm_databricks_workspace.this.id
  azure_client_id             = data.azuread_service_principal.access_connector.client_id
  azure_use_msi               = true
}

resource "databricks_token" "access_connector" {
  provider = databricks.spn
  comment  = "${data.azuread_service_principal.access_connector.display_name} PAT"
}
2 weeks ago - last edited 2 weeks ago
Additionally, the error seems to be intermittent. It affects us greatly because we run Terraform in CI/CD. When it fails, we must re-run it manually. Upon re-running, it sometimes works.
When running it locally, it fails on the first attempt, then succeeds on the second.
2 weeks ago
Hello,
Is there a solution for this issue? I'm facing a similar issue on Azure DevOps with a managed identity too.
│ Error: cannot read spark version: cannot read data spark version: failed during request visitor: inner token: token request: {"error":"invalid_request","error_description":"Identity not found"}
Thanks,
Luis

