Data Engineering

Extending DevOps Service Principal support?

krucial_koala
New Contributor III

As per the previous discussion:

The recommendation was to create a DevOps PAT for the Service Principal and upload it to Databricks using the Git Credential API. The main flaw with this approach is that PATs must be rotated.

The DevOps team recently announced availability of a new capability: "Service principals and managed identities provide an exciting new alternative to personal access tokens"

https://devblogs.microsoft.com/devops/introducing-service-principal-and-managed-identity-support-on-...

Will Databricks support this feature? At the moment, if I run a workflow job with a Service Principal which has access to the DevOps repo I get this error message:

AAD auth error

5 REPLIES 5

Debayan
Esteemed Contributor III

Hi,

A Git repo PAT can only be generated for a user (as in this guide https://learn.microsoft.com/en-us/azure/devops/organizations/accounts/use-personal-access-tokens-to-...), not for a service principal. You will need to use a user’s Git PAT (your own PAT in Azure DevOps should work) for the Service Principal when calling the Git Credentials API: https://stackoverflow.com/questions/72256036/azure-databricks-api-cannot-add-repos-using-service-pri...
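For reference, a minimal sketch of that Git Credentials API call, assuming the request is authenticated with a token belonging to the Service Principal. The function name, workspace URL, and argument names are illustrative only; the endpoint path and the `azureDevOpsServices` provider value are taken from the Databricks REST API as I understand it, so please check them against the current docs. The function only builds the request and does not send anything.

```python
import json
import urllib.request


def build_git_credential_request(workspace_url, sp_token, devops_pat, git_username):
    """Build (but do not send) a POST to the Databricks Git Credentials API.

    workspace_url: base URL of the Databricks workspace (hypothetical value in the example)
    sp_token:      token the Service Principal uses to call the Databricks API
    devops_pat:    Azure DevOps PAT of the user whose Git access is being borrowed
    git_username:  Azure DevOps username that owns the PAT
    """
    payload = {
        "git_provider": "azureDevOpsServices",
        "git_username": git_username,
        "personal_access_token": devops_pat,
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/git-credentials",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {sp_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen(...)` would send it. Because the stored credential is still a user's PAT, it has to be replaced whenever that PAT expires or is revoked.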

Please let us know if this helps. Also, please tag @Debayan​ with your next comment so that I will get notified. Thank you!

Hi @Debayan Mukherjee​, thanks for getting back to me.

Microsoft recommend not using PATs where possible as:

However, using an authentication method tied to a single person also means relying on a single point-of-failure. When a user leaves the company, the PAT driving the team application will become inaccessible to all other team members

They also say:

Additionally, PATs are bearer tokens, which can be leaked easily and fall into the wrong hands. ... we welcome you to explore service principals and managed identities instead.

Based on the risks of users leaving, and token leakage, we have a company policy which limits PAT lifetime to 90 days.

These attributes make it difficult to put a solution into production.

DevOps now supports accessing services without using a PAT, so presumably Databricks could request a bearer token for the Service Principal running the job, from Azure AD?
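To illustrate what I mean, here is a rough sketch of the token request, assuming the standard Azure AD client-credentials flow and the well-known Azure DevOps resource ID as the scope. This is only my assumption about how it could work, not something Databricks does today; the function name and parameters are made up for the example, and it only builds the request without sending it.

```python
import urllib.parse
import urllib.request

# Application (resource) ID that identifies Azure DevOps in Azure AD.
AZURE_DEVOPS_RESOURCE_ID = "499b84ac-1321-427f-aa17-267ca6975798"


def build_devops_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) a client-credentials token request for Azure DevOps."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,          # application ID of the Service Principal
        "client_secret": client_secret,  # secret of the Service Principal
        "scope": f"{AZURE_DEVOPS_RESOURCE_ID}/.default",
    }
    return urllib.request.Request(
        url=f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
        data=urllib.parse.urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The JSON response to such a request contains an `access_token` field, which is a short-lived bearer token, so there would be no long-lived PAT to rotate.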

Debayan
Esteemed Contributor III

Hi, for Azure AD tokens for service principals:

  • Define a service principal in Azure Active Directory and then get an Azure AD access token for that service principal instead of for a user. You configure the service principal as one on which authentication and authorization policies can be enforced in Azure Databricks. Service principals in an Azure Databricks workspace can have different fine-grained access control than regular users (user principals).

Reference: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/

Also note that, as a security best practice when authenticating with automated tools, systems, scripts, and apps, Databricks recommends you use access tokens belonging to service principals instead of workspace users. To create access tokens for service principals, see Manage access tokens for a service principal.

For managing PATs, you can refer to: https://learn.microsoft.com/en-gb/azure/databricks/administration-guide/access-control/tokens

Please let us know if this helps. Also, please tag @Debayan​ with your next comment so that I will get notified. Thank you!

cKunal
New Contributor II

Hi @Debayan. After a lot of searching I finally stumbled upon your response, but I still have some questions. I am trying to install a package (Flask==2.0.2) from my Azure DevOps portal using Databricks. For this I am currently using a PAT and passing it in the %pip install statement in Databricks. I have now created a Service Principal and used it in my `Service Connections` in Azure DevOps, but with this setup I am not able to run my pip install. I have scoured the net for a possible solution. Can you please help?
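For context, the PAT-based approach looks roughly like the sketch below. It assumes the package is served from an Azure Artifacts Python feed; the organization, project, and feed names, the function name, and the placeholder username are all hypothetical.

```python
from urllib.parse import quote


def build_feed_index_url(organization, project, feed, pat, username="build"):
    """Return a pip index URL for an Azure Artifacts feed with the PAT embedded.

    The PAT is sent as the password part of the URL; the username here is only
    a placeholder (an assumption of this sketch).
    """
    credentials = f"{quote(username, safe='')}:{quote(pat, safe='')}"
    return (
        f"https://{credentials}@pkgs.dev.azure.com/"
        f"{organization}/{project}/_packaging/{feed}/pypi/simple/"
    )
```

The resulting URL is what gets passed to `%pip install Flask==2.0.2 --index-url <url>` in the notebook. This is the part I would like to replace with the Service Principal.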

Anonymous
Not applicable

Hi @James Baxter​ 

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
