11-22-2021 08:24 AM
Hello community!
I would like to update a repo from within my Azure DevOps release pipeline.
In the pipeline I generate a token using an AAD service principal, as recommended, and I set up the Databricks API using that token.
When I run the databricks repos update command, I receive an authentication error. This is expected, because the service principal has no Git credentials configured on the workspace side.
My question is:
Can I configure the repos for the SPN programmatically?
Or is there a way to provide an Azure DevOps token when I make the Databricks API call? I have tried passing a token by setting AZURE_DEVOPS_EXT_PAT, but it doesn't seem to work.
Thank you in advance!
05-10-2022 05:10 AM
The solution depends on Azure DevOps accepting service principal tokens as authentication. It is mentioned on their roadmap (https://docs.microsoft.com/en-us/azure/devops/release-notes/features-timeline), but no timeline is defined yet.
As a workaround we use a user PAT for the time being.
07-22-2022 05:10 AM
I see that Microsoft has a draft timeline for this: FY23Q1 (TBC), public preview to all internal and external customers. I don't know why it is taking them so long.
12-01-2022 01:39 PM
Any updates?
12-01-2022 05:23 PM
I was able to accomplish this by getting an AAD service account (not the service principal used by CI/CD) and getting that account access to our GitHub repo. I used the Databricks Git credentials API to configure the service principal's Git config with the AAD service account and a PAT that I generated and authorized in GitHub. I then used the Repos CLI to create the repo in Databricks under the service principal. I just got this working today. Note that I'm still trying to figure out how to update from GitHub, as we are not making any changes within Databricks itself.
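The repo creation and the later update from GitHub could be sketched roughly as follows, in Python rather than the Repos CLI. This assumes the Repos REST API endpoints `POST /api/2.0/repos` (fields `url`, `provider`, `path`) and `PATCH /api/2.0/repos/{repo_id}` (field `branch`). The function names, host, and workspace path are hypothetical, and the requests are only built here, not sent.

```python
import json
import urllib.request


def _request(host, token, method, path, body):
    """Build (but do not send) an authenticated Databricks REST request."""
    return urllib.request.Request(
        f"https://{host}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method=method,
    )


def create_repo_request(host, token, git_url, workspace_path, provider="gitHub"):
    """Request that clones git_url into workspace_path as the calling identity."""
    body = {"url": git_url, "provider": provider, "path": workspace_path}
    return _request(host, token, "POST", "/api/2.0/repos", body)


def update_repo_request(host, token, repo_id, branch):
    """Request that checks out the latest commit of branch in an existing repo."""
    return _request(host, token, "PATCH", f"/api/2.0/repos/{repo_id}", {"branch": branch})
```

Passing either request to `urllib.request.urlopen` with the service principal's token would execute it. Pointing the PATCH at the same branch again is one way to pull the latest commits without making changes inside Databricks.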
02-16-2023 05:42 AM
Dave, do you have a moment to guide me through these steps of yours?
Help would be more than appreciated!!!
02-17-2023 02:31 AM
Hi @Gent Reshtani
As the SPN is not recognized by Azure DevOps yet, we must use a service account for the repo part; everything else should still be done with the SPN.
Be aware that you need to refresh the PAT and update the Git credential periodically.
02-20-2023 04:25 AM
I am using GitHub as my Git provider, not Azure DevOps. As I understand it, I need to update the Git credentials for my service principal. That's my question: how do I make sure that my SPN is using the correct Git credentials (if the PAT has changed, for example)?
02-20-2023 05:12 AM
Although the Databricks API displays the Git credentials as a list, in fact you can bind only one Git credential per SPN. So if it works after the binding, it's the right one; if not, change it.
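With only one credential bound per SPN, rotating a PAT comes down to: list the credentials, update the existing one if there is one, otherwise create a new one. A minimal Python sketch of that decision follows. The helper name `plan_credential_call` is hypothetical, and the sketch assumes the list response has the shape `{"credentials": [{"credential_id": ...}]}`, which should be checked against the Git Credentials API reference.

```python
import json


def plan_credential_call(list_response, git_username, pat, provider="gitHub"):
    """Return (method, path, json_body) for creating or updating the credential."""
    payload = json.dumps({
        "personal_access_token": pat,
        "git_username": git_username,
        "git_provider": provider,
    })
    credentials = list_response.get("credentials") or []
    if credentials:
        # A credential is already bound: update it in place.
        credential_id = credentials[0]["credential_id"]
        return "PATCH", f"/api/2.0/git-credentials/{credential_id}", payload
    # Nothing bound yet: create one.
    return "POST", "/api/2.0/git-credentials", payload
```

The returned method, path, and body can then be sent with whatever HTTP client is at hand, authenticated with the SPN's access token.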
02-20-2023 05:37 AM
@Xiang ZHU ,
Okay, I am very close to being able to change the Git credential for my service principal. My initial error was that the service principal didn't have the correct permissions to view the repo. I suspect this is because the PAT changed (it used to work before).
How do I use the Git Credentials API to make this change? Do I use Postman, or something similar?
Apologies for the low-level question here.
02-20-2023 06:01 AM
Use whatever tool you like to call the API; Postman is certainly OK.
Just follow the official API guide: https://docs.databricks.com/dev-tools/api/latest/gitcredentials.html#operation/create-git-credential
There are only 3 fields in the payload, and the official example is already for a GitHub PAT.
N.B. Call the API with the SPN access token for API authentication. Here is a snippet to get the access token:
curl -X POST \
"https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "client_id=$client_id" \
-d 'grant_type=client_credentials' \
-d 'scope=2ff814a6-3304-4ab8-85cb-cd0e6f879c1d%2F.default' \
-d "client_secret=$client_secret" \
| jq -r .access_token
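For anyone scripting this outside bash, the same client-credentials request can be sketched in Python with only the standard library. The function names and arguments are hypothetical; the endpoint, form fields, and scope are taken from the curl call above. Building the request is kept separate from sending it, so it can be inspected first.

```python
import json
import urllib.parse
import urllib.request

# Scope from the curl snippet above (URL-decoded form of ...%2F.default).
DATABRICKS_SCOPE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) the client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "grant_type": "client_credentials",
        "scope": DATABRICKS_SCOPE,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_access_token(request):
    """Send the request and return the access_token field (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

`fetch_access_token(build_token_request(tenant_id, client_id, client_secret))` would then play the role of the `curl ... | jq -r .access_token` pipeline.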
02-20-2023 08:44 AM
Here is a sample in PowerShell for Azure that checks whether a Git credential is already configured and, if not, sets the Git config for the user. You would just need to tweak it to change the PAT if needed:
"Getting current git-credentials..."
$uri = $databricksUrl + "/api/2.0/git-credentials"
$headers = @{
    "Authorization" = "Bearer $databricksToken"
    "X-Databricks-Azure-SP-Management-Token" = $azToken
    "X-Databricks-Azure-Workspace-Resource-Id" = $wsId
}
# Print only the header names; the values contain tokens.
"Headers: " + ($headers.Keys -join ", ")
"Checking if git-config already exists."
$gitconfig = Invoke-RestMethod -Uri $uri -Headers $headers
if (![String]::IsNullOrWhitespace($gitconfig)) {
    "Git config already exists"
} else {
    # Build the body from a hashtable so the variables are expanded and quoted as JSON.
    $body = @{
        "personal_access_token" = $gitPat
        "git_username" = $gitUsername
        "git_provider" = "gitHub"
    } | ConvertTo-Json
    $gitconfig = Invoke-RestMethod -Method 'Post' -Uri $uri -Headers $headers -Body $body -ContentType "application/json"
    $gitconfig
}
02-21-2023 08:55 AM
Thank you for this script. It gave me additional insight into Databricks access keys.
I ran the following curl command in bash (essentially the same thing):
# The JSON body is in double quotes so that the shell expands $PAT and $GITUSER.
curl -X PATCH -H "Authorization: Bearer $DB_TOKEN" \
-H "X-Databricks-Azure-SP-Management-Token: $AZ_TOKEN" \
-H "X-Databricks-Azure-Workspace-Resource-Id: $WS_ID" \
-d "{\"personal_access_token\": \"$PAT\", \"git_username\": \"$GITUSER\", \"git_provider\": \"gitHub\"}" \
"https://$DATABRICKS_URL/api/2.0/git-credentials/801978151980718"
It works. I can then use the same headers to retrieve other information from the workspace.
My main issue is the error I get when I run the following curl command:
curl -X GET -H "Authorization: Bearer $DB_TOKEN" \
-H "X-Databricks-Azure-SP-Management-Token: $AZ_TOKEN" \
-H "X-Databricks-Azure-Workspace-Resource-Id: $WS_ID" \
https://adb-7866570032917376.16.azuredatabricks.net/api/2.0/repos/1759335429158542
{"error_code":"PERMISSION_DENIED","message":"PERMISSION_DENIED: Missing required permissions [View] on node with ID '1759335429158542'"}
However, I am unable to locate anything with that ID. I can't view it, and I can't delete it.
This is causing Terraform to fail. Do you have any idea what could cause this?
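When scripting around this, it can help to pull the error code and the node ID out of the response, so the failing object can be logged or compared against the repo ID that Terraform is tracking. A small Python sketch, assuming the error body has the same shape as the message quoted in this post (the function name is hypothetical):

```python
import json
import re


def parse_permission_error(body):
    """Return (error_code, node_id) from a Databricks error body; node_id may be None."""
    error = json.loads(body)
    match = re.search(r"node with ID '(\d+)'", error.get("message", ""))
    node_id = match.group(1) if match else None
    return error.get("error_code"), node_id
```

This only extracts the fields; it does not explain why the service principal lacks View permission on that node.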
02-28-2023 01:30 PM
02-28-2023 03:07 PM
The git_username is the service account name. This API binds the service account's PAT to the SP, which is why you need to use the SP's access token in the API auth header.
01-10-2023 01:51 PM
My use case is to create a dbt task inside a Databricks workflow.
It needs to specify `git_source`, and my workflow runs under a service principal account.
Unfortunately, right at the beginning of the workflow run, an error like this is raised:
"Failed to checkout Git repository: PERMISSION_DENIED: Encountered an error with your Azure Active Directory credentials. Please try logging out of Azure Active Directory (https://portal.azure.com) and logging back in."
But I have nowhere to grant the service principal the Azure DevOps checkout permission.
Replacing the service principal with a standard user account works, but we cannot use a user account in production.
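For context, a job definition of the kind described might look like the sketch below, written as a Python dict ready to be serialized for the Jobs API. It assumes the Jobs API 2.1 `git_source` and `dbt_task` fields. The job name, repository URL, branch, and dbt commands are placeholders, and compute settings are omitted.

```python
import json

job_settings = {
    "name": "dbt-nightly",  # placeholder job name
    "git_source": {
        # Placeholder Azure DevOps repository URL.
        "git_url": "https://dev.azure.com/my-org/my-project/_git/my-repo",
        "git_provider": "azureDevOpsServices",
        "git_branch": "main",
    },
    "tasks": [
        {
            "task_key": "dbt_run",
            # Compute settings (cluster or SQL warehouse) are omitted in this sketch.
            "dbt_task": {"commands": ["dbt deps", "dbt run"]},
        }
    ],
}

print(json.dumps(job_settings, indent=2))
```

The checkout of `git_source` at the start of the run is the step that raises the error quoted above when the job runs as the service principal.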