05-06-2022 09:37 AM
I am using Databricks Repos and am trying to read the current Git branch from within a notebook session.
For example:
from pygit2 import Repository
repo = Repository('/Workspace/Repos/user@domain/repository')
The code above throws an error stating that the repository cannot be found. GitPython throws similar errors. It seems that Databricks Repos are set up in a way that prevents these packages from recognising them as Git repositories.
Does anyone have any experience of this?
Thanks
Labels: Databricks Repos, Git Integration
Accepted Solutions
09-15-2022 05:55 AM
As far as I know you cannot use this, but if you are calling code from your repo via a notebook, you can put a workaround in that notebook. It looks the repo up through the Repos API and checks its branch:

    import json
    import requests

    repo_path = "/Repos/xyz_repo_path/xyz_repo_name"
    repo_path_fs = "/Workspace" + repo_path  # filesystem path of the same repo
    repo_branch = "main"  # branch the repo is expected to be on

    def checkRepoInfo():
        # Read the workspace API URL and token from the notebook context.
        # dbutils is only available inside a Databricks notebook session.
        nb_context = json.loads(dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson())
        api_url = nb_context['extraContext']['api_url']
        api_token = nb_context['extraContext']['api_token']
        # List the repos in the workspace through the Repos API.
        db_repo_data = requests.get(f"{api_url}/api/2.0/repos", headers={"Authorization": f"Bearer {api_token}"}).json()
        for db_repo in db_repo_data["repos"]:
            if db_repo["path"] == repo_path:
                db_repo_id = db_repo["id"]
                db_repo_path = db_repo["path"]
                db_repo_branch = db_repo["branch"]
                db_repo_head_commit = db_repo["head_commit_id"]
                print("Git commit info: ID: {} | Path: {} | Branch: {} | Commit: {}".format(db_repo_id, db_repo_path, db_repo_branch, db_repo_head_commit))
                # Fail loudly if the repo is not on the expected branch.
                assert db_repo_branch == repo_branch

    checkRepoInfo()
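The workaround above hard-codes the repo path. A possible variation, sketched here under assumptions, derives the repo root from the notebook's own path and then picks the matching entry out of the "repos" list. The helper names `repo_root_from_notebook_path` and `find_repo` are hypothetical, the sample data is made up, and the code assumes the `/Repos/<folder>/<repo>/...` layout used in this thread. Obtaining the notebook path itself (for example from the notebook context) is left out, since that only works inside a Databricks session.

```python
from typing import Optional


def repo_root_from_notebook_path(notebook_path: str) -> Optional[str]:
    """Return '/Repos/<folder>/<repo>' for a path under /Repos, else None."""
    parts = notebook_path.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "Repos":
        return None
    return "/" + "/".join(parts[:3])


def find_repo(repos: list, repo_path: str) -> Optional[dict]:
    """Return the first entry whose 'path' equals repo_path, else None."""
    for repo in repos:
        if repo.get("path") == repo_path:
            return repo
    return None


# Hypothetical sample shaped like the "repos" list the workaround iterates over.
sample_repos = [
    {"id": 1, "path": "/Repos/someone@example.com/other",
     "branch": "dev", "head_commit_id": "abc123"},
    {"id": 2, "path": "/Repos/someone@example.com/project",
     "branch": "main", "head_commit_id": "def456"},
]

root = repo_root_from_notebook_path(
    "/Repos/someone@example.com/project/notebooks/etl"
)
match = find_repo(sample_repos, root)
print(root, match["branch"] if match else None)
```

Keeping the lookup in plain functions means it can be checked outside Databricks, with only the API call and the context lookup depending on the notebook session.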
05-24-2022 09:46 AM
I'm having the same issue. I couldn't see anything in the documentation that @Kaniz Fatma posted which answers this question either.
It looks like the `.git/` subdirectory isn't actually present at the top level of the repo in Databricks, which seems strange. I don't really understand why that would be, or how Git works in Databricks without the `.git/` subdirectory.
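One way to check this observation is to test whether a directory contains a `.git` subdirectory, since libraries such as pygit2 and GitPython need to find Git metadata on disk. This is a minimal sketch: `has_git_dir` is a hypothetical helper, and the demonstration uses a temporary directory rather than a real repo path. In a notebook you would pass the repo's filesystem path instead.

```python
import os
import tempfile


def has_git_dir(repo_dir: str) -> bool:
    """Return True if repo_dir contains a '.git' directory."""
    return os.path.isdir(os.path.join(repo_dir, ".git"))


# Demonstrate on a throwaway directory, before and after creating '.git'.
with tempfile.TemporaryDirectory() as tmp:
    plain = has_git_dir(tmp)
    os.mkdir(os.path.join(tmp, ".git"))
    with_git = has_git_dir(tmp)

print(plain, with_git)
```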
06-06-2022 02:03 AM
Agreed, it seems very odd. @Kaniz Fatma, are you able to assist any further on this? Is there somewhere in the linked documentation in particular that you believe would be helpful?
07-28-2022 05:25 PM
Hi @Thomas Pile,
Just a friendly follow-up. Were you able to find a solution, or do you still need help? Please let us know.
07-29-2022 06:15 AM
@Jose Gonzalez I cannot speak for @Thomas Pile, but I am also struggling with this issue and have been unable to find a solution.
07-29-2022 06:30 AM
Hi @Jose Gonzalez. I haven't been able to find a solution yet either. Are you able to help?

