Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Use Azure Service Principal to Access Azure Devops

SamGreene
Contributor II

There is another thread marked as answered, but it is not a working solution: Solved: How to use Databricks Repos with a service princip... - Page 2 - Databricks Community - 1178...

In Azure DevOps, there doesn't seem to be a way to generate a PAT for a service principal user. I want to use a DevOps repo as the source for a job step, but this is a blocker.

You can connect to Azure DevOps with a Microsoft Entra ID service principal, which would seem plausible if Databricks implemented support for it: Use service principals & managed identities - Azure DevOps | Microsoft Learn

Thanks for the help. 

8 REPLIES

SamGreene
Contributor II

Hi, I'm still looking for an answer to this. It seems the recommended way to use repos is NOT a shared folder, but if we can't get our SP to source code from Git, we are stuck with the shared-folder repo solution. Being able to use code directly from DevOps would also remove a step from our production promotion process, where we currently have to pull code from Git into the shared folder.

SamGreene
Contributor II

Another issue: a team member created these shared folders and synced them to Git, and now that they have left the team, I need to recreate all the folders to transfer permissions to myself....

I guess the other option is to stop using service principals to run jobs?

Walter_C
Databricks Employee

Isn't this method applicable for you? https://docs.databricks.com/en/dev-tools/ci-cd/ci-cd-sp.html

saurabh18cs
Valued Contributor III

1) Is your SP already onboarded to Azure DevOps? If not, the first step is to grant the SP access to Azure DevOps so it is available and authorized for your repos.

2) Do you want to do this manually or via a pipeline?

3) Set your job's run-as to this SP (see the sketch after the screenshot below).

4)

saurabh18cs_0-1736936663130.png
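For step 3, here is a minimal sketch of setting a job's run-as identity via the Jobs API 2.1. The workspace URL, admin token, job ID, and SP application ID are placeholders, not values from this thread:

import requests

# Minimal sketch: point an existing job's run-as identity at the SP.
# All values below are placeholders -- substitute your own.
workspace_url = "https://your-workspace.azuredatabricks.net"
admin_token = "your_admin_token"

payload = {
    "job_id": 123,  # hypothetical job ID
    "new_settings": {
        # run_as accepts either user_name or service_principal_name
        # (the SP's Entra application ID)
        "run_as": {"service_principal_name": "your_sp_app_id"}
    }
}

resp = requests.post(
    f"{workspace_url}/api/2.1/jobs/update",
    headers={"Authorization": f"Bearer {admin_token}"},
    json=payload,
)
resp.raise_for_status()
print("Job run-as updated.")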

 

saurabh18cs
Valued Contributor III

Let me share the process to authenticate and then execute and test your workflow. Of course, you can make the pseudocode below more industrialized to better suit your needs:

import os
import requests

# Define environment variables and parameters
env = os.getenv('env', 'dev')  # Replace with the actual way to get the environment variable
sp_app_id_dev = 'your_sp_app_id_dev'
sp_app_id_acc = 'your_sp_app_id_acc'
sp_app_id_prd = 'your_sp_app_id_prd'
SP_SECRET_DEV = 'your_sp_secret_dev'
SP_SECRET_ACC = 'your_sp_secret_acc'
SP_SECRET_PRD = 'your_sp_secret_prd'
databricks_wrkspc_url_dev = 'your_databricks_wrkspc_url_dev'
databricks_wrkspc_url_acc = 'your_databricks_wrkspc_url_acc'
databricks_wrkspc_url_prd = 'your_databricks_wrkspc_url_prd'
databricks_token = 'your_databricks_token'  # Replace with the actual way to get the Databricks token

# Determine the environment and set the corresponding variables
if env == 'dev':
    CLIENT_ID = sp_app_id_dev
    CLIENT_SECRET = SP_SECRET_DEV
    databricksWorkspaceUrl = databricks_wrkspc_url_dev
elif env == 'acc':
    CLIENT_ID = sp_app_id_acc
    CLIENT_SECRET = SP_SECRET_ACC
    databricksWorkspaceUrl = databricks_wrkspc_url_acc
else:
    CLIENT_ID = sp_app_id_prd
    CLIENT_SECRET = SP_SECRET_PRD
    databricksWorkspaceUrl = databricks_wrkspc_url_prd

# Get an Entra ID OAuth token for the SP, scoped to Azure DevOps
# (499b84ac-1321-427f-aa17-267ca6975798 is the Azure DevOps resource ID)
token_url = "https://login.microsoftonline.com/<<TENANTID>>/oauth2/v2.0/token"
payload = {
    'client_id': CLIENT_ID,
    'grant_type': 'client_credentials',
    'scope': '499b84ac-1321-427f-aa17-267ca6975798/.default',
    'client_secret': CLIENT_SECRET
}

response = requests.post(token_url, data=payload)
response.raise_for_status()  # Raise an error for bad status codes
sp_devops_token = response.json().get('access_token')

# Debug only -- avoid printing tokens in real pipelines
print(f"SP DevOps Token: {sp_devops_token}")

# Set up Databricks Git credentials, using the DevOps token as the PAT
DATABRICKS_GIT_URL = f"{databricksWorkspaceUrl}/api/2.0/git-credentials"
gitConfig = {
    "personal_access_token": sp_devops_token,
    "git_username": "gbr_sp",
    "git_provider": "azureDevOpsServices"
}

headers = {
    "Authorization": f"Bearer {databricks_token}",
    "Accept": "application/json"
}

# Check if Git credentials already exist
git_exists_response = requests.get(DATABRICKS_GIT_URL, headers=headers)
git_exists_response.raise_for_status()
git_exists = git_exists_response.json().get('credentials', [])

if not git_exists:
    # Create new Git credentials
    create_response = requests.post(DATABRICKS_GIT_URL, headers=headers, json=gitConfig)
    create_response.raise_for_status()
    print("Git credentials created successfully.")
else:
    # Update existing Git credentials
    cred_id = git_exists[0].get('credential_id')
    if not cred_id:
        print("Credential Id is null")
        raise SystemExit(1)
    else:
        update_url = f"{DATABRICKS_GIT_URL}/{cred_id}"
        update_response = requests.patch(update_url, headers=headers, json=gitConfig)
        update_response.raise_for_status()
        print("Git credentials updated successfully.")
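Once the Git credential exists for the SP, the job itself can reference the DevOps repo directly through git_source when you create (or update) it. A minimal sketch reusing the variables above; the repo URL, branch, notebook path, and cluster ID are placeholders:

# Minimal sketch: create a job that sources its notebook straight from
# the Azure DevOps repo. Repo URL, branch, notebook path, and cluster ID
# below are placeholders.
job_config = {
    "name": "job-from-devops-repo",
    "git_source": {
        "git_url": "https://dev.azure.com/your-org/your-project/_git/your-repo",
        "git_provider": "azureDevOpsServices",
        "git_branch": "main"
    },
    "tasks": [
        {
            "task_key": "run_notebook",
            "notebook_task": {
                "notebook_path": "notebooks/etl",  # relative to the repo root
                "source": "GIT"
            },
            "existing_cluster_id": "your_cluster_id"
        }
    ],
    "run_as": {"service_principal_name": CLIENT_ID}
}

create_job_response = requests.post(
    f"{databricksWorkspaceUrl}/api/2.1/jobs/create",
    headers=headers,
    json=job_config,
)
create_job_response.raise_for_status()
print(f"Created job {create_job_response.json().get('job_id')}")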

MadhuB
Contributor III

SP Access to the Databricks workspace - 

The service principal underlying the Azure DevOps service connection should be granted the required permissions on the Databricks workspace and the underlying catalog objects. Create a Databricks workflow and make the principal the owner, with execution rights.

Optional step - you can test this approach with the sample code below, executed through an Azure CLI task in a release pipeline. The SP deploys the code/notebooks from the build location using AAD authentication.

# Install the (legacy) Databricks CLI
python.exe -m pip install --upgrade pip databricks-cli
# Get an AAD token for the Azure Databricks resource
# (2ff814a6-3304-4ab8-85cb-cd0e6f879c1d is the Azure Databricks resource ID)
$token=$(az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d --query "accessToken" --output tsv)
$Env:DATABRICKS_AAD_TOKEN = $token
# Configure the CLI against the workspace and import the build artifacts
databricks configure --aad-token --host $(DatabricksUCDomain)
databricks --debug workspace import_dir $(System.DefaultWorkingDirectory)/ArtifactsDrop/ /Workspace/ProjectFolder/ --overwrite

DevOps Service Connection screen to identify the Principal

MadhuB_2-1736978008308.png

 

Workflow Execution -
The SP should be granted the Service principal: Manager and Service principal: User roles in the Databricks admin console for successful execution of the job. Further, make the SP the owner of the workflow. Refer to the screens below.

Screens to Grant SP access in the admin account console - 

MadhuB_1-1736977857890.png

 

MadhuB_0-1736977808361.png
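As a programmatic alternative to the admin-console screens above, job ownership can also be assigned through the permissions API. A minimal sketch; the workspace URL, admin token, job ID, and SP application ID are placeholders:

import requests

# Minimal sketch: make the SP the owner of a job via the permissions API.
# All values below are placeholders -- substitute your own.
workspace_url = "https://your-workspace.azuredatabricks.net"
admin_token = "your_admin_token"
job_id = 123  # hypothetical job ID

acl = {
    "access_control_list": [
        {
            "service_principal_name": "your_sp_app_id",
            "permission_level": "IS_OWNER"
        }
    ]
}

resp = requests.patch(
    f"{workspace_url}/api/2.0/permissions/jobs/{job_id}",
    headers={"Authorization": f"Bearer {admin_token}"},
    json=acl,
)
resp.raise_for_status()
print("SP set as job owner.")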

 

SamGreene
Contributor II

I found this article - it looks like the scenario I am trying to implement. I want to be able to point a production job running as an SP at an Azure DevOps Git asset.

Use a Microsoft Entra service principal to authenticate access to Azure Databricks Git folders - Azu...

saurabh18cs
Valued Contributor III

Hi @SamGreene, have you tried what I suggested above?
