
How to authenticate databricks provider in terraform using a system-managed identity?

felix_counter
New Contributor III
Hello,
I want to authenticate the Databricks Terraform provider using a system-assigned managed identity in Azure. The identity resides in a different subscription than the Databricks workspace:
 
[Image: managed identity.png]
According to the "authentication" section of the databricks provider documentation, I performed the following steps:

 

  1. Grant the (system-assigned) managed identity the "Contributor" role on Subscription B. I can confirm via the Azure portal that the app service behind the managed identity does have the "Contributor" role on the subscription in which the Databricks workspace resides.
  2. Register the managed identity as a Databricks service principal in the Databricks workspace using its application ID.
  3. Initialize the databricks provider with the following arguments:
    • host: host address of the Databricks workspace
    • azure_workspace_resource_id: resource ID of the Azure Databricks workspace, obtained from an "azurerm_databricks_workspace" data source
    • azure_client_id: application ID of the system-assigned managed identity / registered Databricks service principal
    • azure_use_msi: true
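For reference, the provider configuration described in step 3 could be sketched roughly as below. The data source label `this` and the quoted placeholder values are illustrative, not taken from the actual setup. Whether `azure_client_id` should be set at all for a system-assigned identity is the open question of this post, so this is a transcription of the attempt rather than a known-good configuration.

```hcl
# Look up the existing workspace. The name and resource group are placeholders.
data "azurerm_databricks_workspace" "this" {
  name                = "<workspace-name>"
  resource_group_name = "<workspace-resource-group>"
}

# Provider configured as described in step 3 of the list above.
provider "databricks" {
  host                        = "https://${data.azurerm_databricks_workspace.this.workspace_url}"
  azure_workspace_resource_id = data.azurerm_databricks_workspace.this.id
  azure_client_id             = "<application-id-of-managed-identity>" # placeholder
  azure_use_msi               = true
}
```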
I tried to create a resource using this provider.
The terraform plan step looks good, i.e. the resource I want to create shows up in the planning step. However, during the apply step I encounter the following error:
 
 Error: cannot create [redacted]: inner token: token error: ***"error":"invalid_request","error_description":"Identity not found"***
 
This error occurs regardless of which resource I create (I tried several). The problem seems to be in the authentication with the managed identity.
 
Is it possible to authenticate the Databricks provider using a system-assigned managed identity? If yes, what would be the correct configuration for the provider and the environment in this setup? I am a bit confused about how to point the provider at the right identity / SPN. To do so, I set the parameter "azure_client_id" to the managed identity's application ID, but I am not sure whether this is correct.
 
3 REPLIES

Dear @Retired_mod,

Thanks a lot for your response with the step-by-step guide to authenticating with Databricks using a managed identity.

However, to the best of my understanding, this is not what I want to achieve. To recap, my goal is to use a system-assigned (i.e., not a user-assigned) managed identity of a web app to authenticate with the Terraform Databricks provider (i.e., not the CLI). I would be very grateful if you could provide a similar step-by-step guide for this setup.

felix_counter
New Contributor III

I also tried to authenticate using a user-assigned managed identity. In detail, I performed the following steps using Terraform:

  1. Create a user-assigned managed identity in the same resource group as the databricks workspace
  2. Create a databricks service principal setting 'application_id' to the client id of the managed identity. 
  3. Assign the managed identity the "Contributor" role on the subscription in which the databricks workspace is located.
  4. Declare a databricks provider, setting 'azure_use_msi' to true, 'host' to the databricks workspace URL, 'azure_workspace_resource_id' to the resource ID of the databricks workspace, and 'azure_client_id' to the application ID of the managed identity.
  5. Create a databricks token using said provider
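The five steps above could be sketched in Terraform roughly as follows. This is a reconstruction, not the configuration actually used. The provider aliases `bootstrap` and `msi`, all resource labels, and the quoted placeholder values are assumptions. In particular, the `bootstrap` alias stands for some already-authenticated provider that is allowed to register service principals in the workspace. Because the `msi` provider depends on attributes of resources created in the same configuration, a real setup may need to be applied in two phases.

```hcl
data "azurerm_subscription" "current" {}

data "azurerm_databricks_workspace" "this" {
  name                = "<workspace-name>"
  resource_group_name = "<workspace-resource-group>"
}

# Assumed: a provider that can already authenticate (for example via
# environment variables) and is allowed to create service principals.
provider "databricks" {
  alias = "bootstrap"
  host  = "https://${data.azurerm_databricks_workspace.this.workspace_url}"
}

# Step 1: user-assigned managed identity in the workspace's resource group.
resource "azurerm_user_assigned_identity" "tf" {
  name                = "<identity-name>"
  resource_group_name = "<workspace-resource-group>"
  location            = "<azure-region>"
}

# Step 2: register the identity as a Databricks service principal.
resource "databricks_service_principal" "tf" {
  provider       = databricks.bootstrap
  application_id = azurerm_user_assigned_identity.tf.client_id
  display_name   = "terraform-managed-identity" # hypothetical name
}

# Step 3: "Contributor" on the subscription that holds the workspace.
resource "azurerm_role_assignment" "tf" {
  scope                = data.azurerm_subscription.current.id
  role_definition_name = "Contributor"
  principal_id         = azurerm_user_assigned_identity.tf.principal_id
}

# Step 4: provider that authenticates with the managed identity.
provider "databricks" {
  alias                       = "msi"
  host                        = "https://${data.azurerm_databricks_workspace.this.workspace_url}"
  azure_workspace_resource_id = data.azurerm_databricks_workspace.this.id
  azure_client_id             = azurerm_user_assigned_identity.tf.client_id
  azure_use_msi               = true
}

# Step 5: create a token through the MSI-authenticated provider.
resource "databricks_token" "pat" {
  provider         = databricks.msi
  comment          = "created via managed identity"
  lifetime_seconds = 3600
}
```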

The same error ("Identity not found") occurs during the terraform apply of step 5 (token creation). I also tried creating other resources; they all fail with the error message stated above. @alexott, do you have a suggestion?

Thanks a lot for your support! 

FarBo
New Contributor III

@felix_counter 

I think I have your answer.

To create a databricks provider that manages your workspace using a service principal (SPN), define the provider like this:

provider "databricks" {
  alias               = "workspace"
  host                = "<your workspace URL>"
  azure_client_id     = "<Application ID of the SPN>"
  azure_client_secret = "<Application secret of the SPN>"
  azure_tenant_id     = "<Your Azure tenant ID>"
}

I store all of these credentials as secrets in my Azure Key Vault. I then define data sources that read the secret values from the Key Vault and pass them to the databricks provider definition. As you probably know, you need the azurerm provider for this. Below is the full block:

# Key Vault that holds the workspace URL and the SPN credentials
data "azurerm_key_vault" "key_vault" {
  name                = "<your keyvault_name>"
  resource_group_name = "<your rg_name>"
}

# One data source per secret stored in the Key Vault
data "azurerm_key_vault_secret" "workspace_url" {
  name         = "<Workspace-URL>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

data "azurerm_key_vault_secret" "workspace_admin_spn_app_id" {
  name         = "<Workspace-ADMINSPN-APPLICATIONID>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

data "azurerm_key_vault_secret" "workspace_admin_spn_app_secret" {
  name         = "<Workspace-ADMINSPN-APPLICATIONSECRET>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

data "azurerm_key_vault_secret" "tenant_id" {
  name         = "<AZURE-TENANTID>"
  key_vault_id = data.azurerm_key_vault.key_vault.id
}

# Databricks provider authenticated with the SPN's client secret
provider "databricks" {
  alias               = "workspace"
  host                = data.azurerm_key_vault_secret.workspace_url.value
  azure_client_id     = data.azurerm_key_vault_secret.workspace_admin_spn_app_id.value
  azure_client_secret = data.azurerm_key_vault_secret.workspace_admin_spn_app_secret.value
  azure_tenant_id     = data.azurerm_key_vault_secret.tenant_id.value
}
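Because the provider above is declared with an alias, resources and data sources have to select it explicitly. A minimal sketch of the surrounding pieces follows. The `required_providers` block, the `azurerm` provider block, and the names `me` and `authenticated_as` are illustrative additions, not part of the block above. Reading the current user is just one cheap way to check that authentication works.

```hcl
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# The azurerm provider is needed for the Key Vault data sources.
provider "azurerm" {
  features {}
}

# Select the aliased provider explicitly; without this line Terraform
# would look for a default (un-aliased) databricks provider.
data "databricks_current_user" "me" {
  provider = databricks.workspace
}

# Shows which identity the provider authenticated as.
output "authenticated_as" {
  value = data.databricks_current_user.me.user_name
}
```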

 
