
How to use Databricks Repos with a service principal for CI/CD in Azure DevOps?

michael_mehrten
New Contributor III

Databricks Repos best practices recommend using the Repos REST API to update a repo via your git provider. The REST API requires authentication, which can be done in one of two ways:

  1. A user / personal access token
  2. A service principal access token

Using a user access token authenticates the REST API as the user, so all repo actions are performed under the user's identity. This isn't desirable for automation, as all automation tasks are then tied to a specific user account. In this case, a service principal would be preferable. As far as I can tell, the service principal doesn't work with Azure DevOps, because the service principal doesn't have access to the Azure DevOps git repo.

Has anyone had success getting a service principal access to Azure DevOps? If not, what alternatives have people used to integrate Databricks Repos with Azure DevOps CI/CD (apart from using personal access tokens)?
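
For reference, the repo update the question refers to can be sketched as below. This is a minimal illustration, assuming the `PATCH /api/2.0/repos/{repo_id}` endpoint with a `branch` field in the body; the workspace URL, repo ID and token are hypothetical placeholders, and the helper name `build_repo_update_request` is made up for this example. The request is only built, not sent.

```python
import json
import urllib.request


def build_repo_update_request(host: str, token: str, repo_id: int, branch: str) -> urllib.request.Request:
    """Build (but do not send) a request that checks out `branch` in a Databricks repo."""
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/repos/{repo_id}",
        data=body,
        method="PATCH",
        headers={
            # The token decides which identity performs the update:
            # a user PAT acts as the user, a service principal token as the SP.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values, for illustration only.
request = build_repo_update_request(
    "https://adb-1234567890123456.7.azuredatabricks.net", "<access-token>", 123, "main"
)
# urllib.request.urlopen(request) would actually send it from a CI/CD step.
```

Whichever identity owns the token also needs a Git credential that Azure DevOps accepts, which is the sticking point described above.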

27 REPLIES 27

rsenjins
New Contributor III

How did you solve this? Where did you find a way to create PAT tokens for service principals? The other comments don't make it clear to me either.

terrymunro
New Contributor II

Unfortunately I didn't find any solution to this. 🙁

rsenjins
New Contributor III

Did you eventually manage to run the Git-integrated workflow with a service principal?

I get this when calling api/2.0/token-management/on-behalf-of/tokens from Postman:

 

{
    "error_code": "ENDPOINT_NOT_FOUND",
    "message": "No API found for 'GET /token-management/on-behalf-of/tokens'"
}

 

Using the Databricks CLI (`databricks token-management create-obo-token`), I get the following error:

 

Error: On-behalf-of token creation for service principals is not enabled for this workspace
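
Note that the Postman error above mentions `GET`, whereas, as far as I know, on-behalf-of token creation is a `POST` to that path. The sketch below shows what such a request might look like, assuming the body fields `application_id`, `lifetime_seconds` and `comment`; the host, token and application ID are hypothetical placeholders and `build_obo_token_request` is a made-up helper name. Even with the right method, the CLI error suggests the feature may simply not be enabled in this workspace.

```python
import json
import urllib.request


def build_obo_token_request(host: str, admin_token: str, application_id: str,
                            lifetime_seconds: int, comment: str) -> urllib.request.Request:
    """Build (but do not send) a POST that asks for a token on behalf of a service principal."""
    body = json.dumps({
        "application_id": application_id,      # application (client) ID of the service principal
        "lifetime_seconds": lifetime_seconds,  # how long the token should stay valid
        "comment": comment,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/token-management/on-behalf-of/tokens",
        data=body,
        method="POST",  # a GET on this path is what produced ENDPOINT_NOT_FOUND above
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
    )
```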

 

xiangzhu
Contributor

Hello,

My use case is to create a dbt task inside a Databricks workflow.

It needs to specify `git_source`, and my workflow runs under a service principal account.

Unfortunately, right at the beginning of the workflow run, an error like this is raised:

"Failed to checkout Git repository: PERMISSION_DENIED: Encountered an error with your Azure Active Directory credentials. Please try logging out of Azure Active Directory (https://portal.azure.com) and logging back in."

But I have nowhere to grant the service principal the Azure DevOps checkout permission.

Replacing the service principal with a standard user account works, but we cannot use a user account in production.
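
To make the setup concrete, here is a rough sketch of the job settings involved, assuming the Jobs API 2.1 shape with a job-level `git_source` (`git_url`, `git_provider`, `git_branch`) and a `dbt_task` with `commands`. The repository URL, job name and task key are hypothetical, and cluster or warehouse settings are left out. As far as I understand, `git_source` has no field for credentials; the checkout relies on the Git credential stored in the workspace for the identity that runs the job.

```python
import json


def build_dbt_job_settings(git_url: str, branch: str) -> dict:
    """Return a partial job definition with a dbt task that checks out code from Azure DevOps."""
    return {
        "name": "dbt-from-azure-devops",  # hypothetical job name
        "git_source": {
            "git_url": git_url,
            "git_provider": "azureDevOpsServices",
            "git_branch": branch,
        },
        "tasks": [
            {
                "task_key": "dbt_run",  # hypothetical task key
                "dbt_task": {"commands": ["dbt deps", "dbt run"]},
            }
        ],
    }


# Hypothetical repository URL, for illustration only.
settings = build_dbt_job_settings("https://dev.azure.com/my-org/my-project/_git/my-repo", "main")
print(json.dumps(settings, indent=2))
```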

Anonymous
Not applicable

Are you able to use an Azure DevOps personal access token for this instead of Active Directory credentials? Using DevOps PATs directly should work with service principals.

Yes, I tried adding the PAT in the git_source, but it doesn't work either.

I'm sure the PAT and the URL are OK, as I can `git clone (git_source url value with PAT inside)` locally without any issue.

Tested with runtime: 10.4.x-cpu-ml-scala2.12

171499
New Contributor III

@Vaibhav Sethi "first add the Git PAT token for the service principal via the Git Credential API." Is this possible? In the Git Credentials API guide I only see the option to set the credentials for the current user (not a service principal).

Yes, it is. You can find a quick tutorial at this link:

Repos configuration for Azure Service Principal (databricks.com)

The link is broken.
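
Since the link no longer resolves, here is a minimal sketch of the idea as I understand it: the Git Credentials API stores the credential for whichever identity makes the call, so calling it with a token that authenticates as the service principal should attach the Azure DevOps PAT to that service principal. It assumes `POST /api/2.0/git-credentials` with the fields `git_provider`, `git_username` and `personal_access_token`; all values are placeholders and `build_git_credential_request` is a made-up helper name.

```python
import json
import urllib.request


def build_git_credential_request(host: str, sp_token: str, git_username: str,
                                 azure_devops_pat: str) -> urllib.request.Request:
    """Build (but do not send) a POST that stores an Azure DevOps PAT for the calling identity."""
    body = json.dumps({
        "git_provider": "azureDevOpsServices",
        "git_username": git_username,
        "personal_access_token": azure_devops_pat,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/git-credentials",
        data=body,
        method="POST",
        headers={
            # Must be a token for the service principal itself, not for a user,
            # otherwise the credential is stored for the user instead.
            "Authorization": f"Bearer {sp_token}",
            "Content-Type": "application/json",
        },
    )
```

The PAT itself still has to be issued by an Azure DevOps user with access to the repository, so this does not remove the dependency on a user account entirely.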

bradleyjamrozik
New Contributor III

I am also having this issue. 


 

camilo_s
New Contributor II

Since last year, Azure has offered workload identity federation, which can be used to authenticate SPs against Azure DevOps using Microsoft Entra ID. Any chance Databricks might leverage this to implement OAuth for authenticating service principals against Azure DevOps?

martindlarsson
New Contributor III

Having the exact same problem. Did you find a solution, @michael_mehrten?

In my case I'm using a managed identity, so the solution suggested in some topics, generating an access token from an Entra ID service principal, is not applicable.
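
For context, the approach those topics describe is, as far as I can tell, the OAuth client credentials flow against Microsoft Entra ID, requesting a token scoped to the Azure Databricks resource. The sketch below assumes the v2.0 token endpoint and the well-known Azure Databricks resource ID `2ff814a6-3304-4ab8-85cb-cd0e6f879c1d`; the tenant ID, client ID and secret are placeholders and `build_entra_token_request` is a made-up helper name. It needs a client secret, which is exactly why it does not carry over to a managed identity.

```python
import urllib.parse
import urllib.request

# Resource (application) ID that identifies Azure Databricks in Entra ID.
AZURE_DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def build_entra_token_request(tenant_id: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request for a service principal."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": f"{AZURE_DATABRICKS_RESOURCE_ID}/.default",
    }).encode("ascii")
    return urllib.request.Request(
        url=f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

If the request succeeds, the JSON response carries the token in its `access_token` field, which can then be used as the bearer token for Databricks REST calls made as the service principal.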
