04-30-2024 07:28 AM
Hi,
I have cloned a public git repo into my Databricks account. It's a repo associated with an online training course. I'd like to work through the notebooks, maybe make some changes and updates, etc., but I'd also like to keep a clean copy of it.
My preferred strategy would be to do all my work on a new branch, keeping `master` unaltered. However, Databricks refuses the branch create instruction, saying "Invalid Git provider credentials", which I infer is about my lacking credentials to write back to the public repo.
What strategy will work for my intention? I had expected that since the clone is in my Databricks account, it shouldn't mind my creating a branch, even if I don't have credentials to ever push it back to the original source.
Thanks.
04-30-2024 05:05 PM
Hi DavidKxx,
You can clone public remote repositories without Git credentials (a personal access token and a username). To modify a public remote repository or to clone or modify a private remote repository, you must have a Git provider username and PAT with Write (or greater) permissions for the remote repository.
The GitHub documentation includes this note: "You can only create a branch in a repository to which you have push access."
So, it is a restriction from github really. Hope this helps.
Thanks!
05-06-2024 06:09 AM
I don't see how that addresses the issue. The page you linked to seems to be talking about creating a branch on github, which is not what I'm trying to do. I want to create a branch only in my local clone of the repo -- I don't care about ever being able to push it back to github.
Here's another view of the problem. I opened a `git bash` window on my Windows computer and cloned the same repo to a new local repo, and jumped inside the repo with `cd`. At this point, I find that creating a new branch from the command line works successfully, and thus the new branch exists and is checked out. My lack of github credentials to push to the original repo is no obstacle to that.
But as noted in the original question, Databricks won't let me do that on the repo inside Databricks. Does that mean that Databricks is doing some kind of invisible little push when creating a new branch, and that's when the lack of credentials causes the branch creation to fail?
05-21-2024 02:43 AM
Hi @DavidKxx ,
I am not sure I really understand your issue. Can you please share the steps you followed to run into it, so that I can reproduce it on my end?
I referred to the doc, and it appears what you intend is possible. Just referencing the relevant doc on creating a branch - https://docs.databricks.com/en/repos/git-operations-with-repos.html#create-a-new-branch
05-21-2024 07:07 AM
Steps to reproduce:
In summary, Databricks won't let me create a new branch in the local repo.
In contrast, as I noted in my last post, I can create a new branch in the local repo if I do it through the `git bash` command line, by the following sequence of commands:
git clone https://github.com/bradyneal/causal-book-code.git
cd causal-book-code
git checkout -b test
This results in a new branch `test` having been created and checked out in the local repo. Therefore, it's generically possible to create a new branch locally in a repo to which I have no Github write permissions, and the fact that I can't do it in Databricks appears to be a Databricks issue or design choice.
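For anyone who wants to confirm this behaviour without touching GitHub at all, here is a minimal sketch. It assumes only that `git` is installed; the scratch directory and the names `upstream` and `clone` are made up for illustration, with a throwaway local repository standing in for the public one.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory; nothing here talks to GitHub.
work=$(mktemp -d)
cd "$work"

# Stand-in for the public repo: a local repository with one commit.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Clone it, then create and check out a new branch in the clone.
git clone -q upstream clone
git -C clone checkout -q -b test

# Print the branch that is now checked out in the clone.
git -C clone rev-parse --abbrev-ref HEAD

# List any branch named "test" in the stand-in remote (there is none,
# because creating a branch locally never writes to the remote).
git -C upstream branch --list test
```

No credentials are involved at any point, which matches what the `git bash` session showed: a branch is just a local ref until something tries to push it.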
05-21-2024 10:12 AM
I get your issue, @DavidKxx. On the command line, we do not see "Authentication failed" until we run a git push:
git push origin test
In the Databricks UI, by contrast, we fail early (screenshots below). We require the Databricks GitHub App, as mentioned here, to give us access to the repository.
From the doc on GitHub Apps, it appears this is how "apps" work: they have a set of permissions that they require.
I looked at the error returned by the request; it is PERMISSION_DENIED:
{
  "data": {
    "projectsProjectGitCheckout": {
      "branch": null,
      "apiError": {
        "code": "PERMISSION_DENIED",
        "message": "Link to GitHub account does not have access. To fix this error:\n1. GitHub user bradyneal should go to https://github.com/apps/databricks/installations/new and install the app on the account (bradyneal) to allow access.\n2. If user bradyneal already installed the app and they are using scoped access with the 'Only select repositories' option, they should ensure they have included access to this repository by selecting it.\nRefer to https://docs.databricks.com/en/repos/get-access-tokens-from-git-provider.html#link-github-account-using-databricks-github-app for more information. If the problem persists, please file a support ticket.",
        "__typename": "ApiError"
      },
      "__typename": "ProjectsProjectGitCheckoutResponse"
    }
  }
}
Kindly let me know if you think otherwise. Why this works on the command line but not through the app is a separate question; it may not be possible to enforce the permissions check on the command line, since the branch there is still local.
07-09-2024 08:30 AM
It occurs to me that one valid solution to this problem is simply to fork the repo and work there. Pretty standard approach, I guess, although not something I've ever been in the habit of doing.
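A rough sketch of that fork workflow, for anyone landing here later. To keep it runnable offline, two local bare repositories stand in for the original repo and the fork; on GitHub you would create the fork in the web UI and use its URL instead. Every name below (`original.git`, `fork.git`, `work-copy`, `my-changes`) is hypothetical.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Stand-ins: "original.git" plays the public repo, "fork.git" plays your fork.
git init -q seed
git -C seed -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "course material"
git clone -q --bare seed original.git
git clone -q --bare original.git fork.git

# Clone the fork (which you can write to) and keep a pointer to the original.
git clone -q fork.git work-copy
cd work-copy
git remote add upstream "$work/original.git"

# Do the work on a branch and push it to the fork, not to the original.
git checkout -q -b my-changes
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "my notes"
git push -q origin my-changes

# The default branch stays untouched and can still follow the original repo.
git fetch -q upstream
```

Since the fork lives under your own GitHub account, you have push access to it, so the write-permission requirement described earlier in the thread should no longer get in the way when the fork, rather than the original, is the repo added in Databricks.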