04-24-2024 03:00 AM
Is it possible to trigger a git reset --hard programmatically?
I'm running a platform service where, as part of CI/CD, repos get deployed into the Databricks workspace. Normally, our developers work with upstream repos both from their local IDEs and from the Databricks Code Editor. It sometimes happens, though, that their mixed (and sometimes incorrect) use of the IDE and the Databricks GUI causes the repo deployment process to fail during CI/CD (error code GIT_CONFLICT).
In that case, we'd like to catch this error and do a hard reset to overwrite the Databricks Repo with the status of the Upstream repo (which should be the single source of truth anyhow).
Is it possible to do this programmatically?
05-06-2024 06:26 AM
Hi @Retired_mod, thanks for your reply.
My question was about programmatically doing what Databricks does on the Databricks Repository when I click on the Reset (hard) option in the UI; apologies for not emphasizing this clearly enough.
I tracked the network activity in the browser when clicking on that button and noticed it calls an endpoint https://<WORKSPACE_URL>/graphql/projectGitReset_ProjectGitModal which unfortunately doesn't seem to be public (it's not documented anywhere).
I realize git reset --hard can be a destructive operation, but in our use case the remote repositories should be the single source of truth at all times, which is why it would make sense to be able to programmatically hard reset a repo to a given branch when performing a repo update. Without this, an unstaged change in a Git repository blocks calls to the "Update a repo" endpoint, which is what we'd like to avoid.
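For readers who want to see where the failure surfaces, here is a minimal sketch of calling the "Update a repo" endpoint and reading back the error code. It assumes the public Repos API route `PATCH /api/2.0/repos/{repo_id}` with a `branch` field, and it assumes the error response is a JSON object carrying an `error_code` field; the function name `update_repo_branch` and the `opener` parameter are hypothetical and exist only to make the sketch testable without a workspace.

```python
import json
import urllib.error
import urllib.request


def update_repo_branch(host, token, repo_id, branch, opener=urllib.request.urlopen):
    """Ask the workspace to check out `branch`; return (ok, error_code)."""
    req = urllib.request.Request(
        url=f"{host}/api/2.0/repos/{repo_id}",
        data=json.dumps({"branch": branch}).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    try:
        with opener(req):
            return True, None
    except urllib.error.HTTPError as err:
        # Assumption: the error body is a JSON object with an "error_code" key.
        try:
            body = json.loads(err.read().decode("utf-8") or "{}")
        except ValueError:
            body = {}
        code = body.get("error_code") if isinstance(body, dict) else None
        return False, code
```

A deploy script could treat a returned code of `GIT_CONFLICT` as the signal to fall back to some other recovery step, since the update call itself does not overwrite local changes.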
09-19-2024 06:35 PM
Hi from the Git folders/Repos PM:
We don't currently have a solution to hard reset programmatically. This is on our roadmap.
If you orchestrate production jobs using the Databricks workflows, there is an option to use version-controlled source code in a Databricks job.
This should eliminate the possibility of executing uncommitted changes. Is this an acceptable solution for you?
09-19-2024 11:11 PM
Hi @nicole_lu_PM,
We do use version-controlled source code in our Databricks jobs for some use cases, but it too has some major shortcomings regarding PAT management and lifecycle when setting up Git credentials. I've documented them here and truly wish the engineering ergonomics would improve there.
09-26-2024 12:04 AM
Not sure if we have that option: we are orchestrating jobs from Azure Data Factory. Is there a way to check out a particular branch for a job triggered from Azure Data Factory?
If there is not, we still need the "databricks repo update --branch main --force" in our DevOps deploy script.
09-27-2024 10:29 AM - edited 09-27-2024 10:32 AM
Hi Gubbanoa,
Thank you for the suggestion. Can you try deleting the Git folder and recloning it in the workspace from your automation script instead? Let me know if this does not work for your workflow; I would love to brainstorm an alternative solution or advocate for this feature.
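The delete-and-reclone workaround could look roughly like the sketch below. It assumes the public Repos API routes (`DELETE /api/2.0/repos/{repo_id}`, `POST /api/2.0/repos` with `url`, `provider` and `path`, then `PATCH /api/2.0/repos/{repo_id}` with `branch`). The `call(method, api_path, payload)` argument is a hypothetical helper that sends an authenticated request to the workspace and returns the parsed JSON response; it is not part of any Databricks library.

```python
def reclone_repo(call, repo_id, url, provider, path, branch):
    """Delete a Git folder and recreate it from the remote.

    Returns the ID of the newly created Git folder. Deleting the folder
    discards any uncommitted changes made in the workspace.
    """
    # Drop the existing Git folder, including its local changes.
    call("DELETE", f"/api/2.0/repos/{repo_id}", None)

    # Clone the remote again at the same workspace path.
    created = call(
        "POST",
        "/api/2.0/repos",
        {"url": url, "provider": provider, "path": path},
    )

    # Check out the branch that should be deployed.
    call("PATCH", f"/api/2.0/repos/{created['id']}", {"branch": branch})
    return created["id"]
```

Note that the recreated Git folder gets a new ID, so a deploy script that stores the repo ID would need to look it up again or keep the value returned here.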
09-25-2024 09:30 AM
Thank you for the feedback there! We recently added more docs for SP OAuth support for DevOps. SP OAuth support for GitHub is being discussed.