Data Engineering

Can I use a Git provider with a service principal in a job?

Yuki
New Contributor

Hi everyone,

I'm trying to use a Git provider in a Databricks job.

At first, I was using my personal user account for `Run as`.

But when I changed `Run as` to a service principal, the job failed with a permission error.

I can't find a way to solve it.

Is it possible to configure this?

[Attached screenshot: Yuki_0-1699340000007.png]


Kaniz
Community Manager

Hi @Yuki, Certainly! When using a Service Principal to run Databricks jobs and encountering permission errors with the Git provider, here are some steps you can take to troubleshoot and resolve the issue:

 

Confirm Git Integration Settings:

  • Go to User Settings > Linked accounts in Databricks.
  • Ensure that you have selected the correct Git provider (e.g., GitHub, GitLab, Bitbucket).
  • Enter both your Git provider username and personal access token (PAT).
  • Note that legacy Git integrations did not require a username, so you might need to add one for Databricks Repos.

Verify Repo Access:

  • Make sure your personal access token or app password has the correct access to the repository.
  • If your Git provider uses Single Sign-On (SSO), authorize your tokens for SSO.

Test with Git Command Line:

  • Use the Git command line to test your token: git clone https://<username>:<personal-access-token>@github.com/<org>/<repo-name>.git
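
The same token check can be scripted, which may be handy when the username or token contains characters that need escaping in a URL. The following is a minimal sketch, not an official Databricks or GitHub tool: the helper names `build_clone_url` and `token_can_read_repo` are hypothetical, the host defaults to github.com as in the command above, and it uses `git ls-remote` instead of a full clone so that nothing is written to disk.

```python
import os
import subprocess
from urllib.parse import quote


def build_clone_url(username: str, token: str, org: str, repo: str,
                    host: str = "github.com") -> str:
    """Return an HTTPS Git URL with percent-encoded credentials (hypothetical helper)."""
    user = quote(username, safe="")   # e.g. "@" becomes "%40"
    secret = quote(token, safe="")
    return f"https://{user}:{secret}@{host}/{org}/{repo}.git"


def token_can_read_repo(url: str, timeout: int = 30) -> bool:
    """Return True if `git ls-remote` can list the remote's branches (hypothetical helper)."""
    # Disable interactive prompts so a rejected token fails instead of hanging.
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}
    try:
        result = subprocess.run(
            ["git", "ls-remote", "--heads", url],
            capture_output=True, text=True, timeout=timeout, env=env,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # git is not installed, or the server did not answer in time.
        return False
    return result.returncode == 0
```

Calling `token_can_read_repo(build_clone_url(...))` with your own values returns `False` both for a rejected token and for an unreachable server, so check the Git output by hand if you need to tell the two apart.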

Secure Connection (SSL) Problems:

  • If you encounter SSL problems, ensure that your Git server is accessible from Azure Databricks.

Timeout Errors:

  • Expensive operations (e.g., cloning large repos) might result in timeout errors. These operations could complete in the background.
  • Consider using sparse checkout for large repos.

404 Errors:

  • If you receive a 404 error when opening a non-notebook file, wait a few minutes and try again. There can be a delay between workspace enabling and webapp configuration updates.

Resolve Notebook Name Conflicts:

  • Conflicting notebook names (similar or identical filenames) can cause issues when creating a repo or pull request.
  • Ensure that folders do not contain notebooks with the same name as other notebooks, files, or folders (excluding file extensions).
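
To check a local clone for such conflicts before creating the repo in Databricks, something like the sketch below might help. It assumes the rule is exactly as described above (entries in the same folder must not share a name once file extensions are ignored); `find_name_conflicts` is a hypothetical helper, not part of any Databricks tooling.

```python
from collections import defaultdict
from pathlib import Path


def find_name_conflicts(root: str) -> dict:
    """Map (folder, extension-less name) to the entries that share that name."""
    base = Path(root)
    groups = defaultdict(list)
    for path in base.rglob("*"):
        # Ignore Git's own metadata directory.
        if ".git" in path.relative_to(base).parts:
            continue
        # Folders are compared by full name, files by name without extension.
        key_name = path.name if path.is_dir() else path.stem
        groups[(str(path.parent), key_name)].append(path.name)
    # Only names used by more than one entry are conflicts.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

For example, a folder containing both `etl.py` and a subfolder named `etl` would be reported, while `etl.py` and `report.py` side by side would not.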

Remember to adjust these settings based on your specific Git provider and repository configuration. With these steps, you should be able to successfully use the Git provider in your Databricks jobs! 🚀🔧

martindlarsson
New Contributor III

@Kaniz you mentioned doing this with a service principal at the top of your answer, but then gave no instructions on how to do just that! How is one supposed to go to User Settings > Linked accounts as a service principal?

martindlarsson
New Contributor III

The documentation is lacking in this area, which should be easy to set up. Instead, we are forced to search through community topics such as this one.
