
Updating a Databricks Git repo from a GitHub Action - how to

NielsMH
New Contributor III

Hi

My company is migrating from Azure DevOps to GitHub, and we have a pipeline in Azure DevOps that updates/syncs Databricks repos whenever a pull request is made to the development branch. The Azure DevOps pipeline (which works) looks like this:

trigger:
- development

pool:
  vmImage: ubuntu-latest

steps:
- task: AzureKeyVault@2
  inputs:
    azureSubscription: 'Azure Service Connection'
    KeyVaultName: $(keyvault_name)
    SecretsFilter: '*'
    RunAsPreJob: true
- task: Bash@3
  inputs:
    targetType: 'inline'
    script: |
      echo "Setup Databricks environmental variables (to be able to autoconfig databricks-cli)"
      export DATABRICKS_HOST=$(DATABRICKSHOST)
      export DATABRICKS_TOKEN=$(DATABRICKSTOKEN)
      echo ${#DATABRICKS_HOST}
      echo Install Databricks CLI
      pip install databricks-cli
      echo Update Repo
      databricks repos update --repo-id $(repo_id) --branch $(branch)


I have rewritten it as a GitHub Actions workflow (I basically dropped the step that fetches secrets from Azure Key Vault and read them directly from the GitHub secrets instead), like this:

name: Sync to Databricks Repo

on:
  push:
    branches:
      - development
  workflow_dispatch:

jobs:
  sync-to-databricks:
    runs-on: ubuntu-latest

    env:
      DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}

    steps:
      - name: Check out repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'

      - name: Install and Configure Databricks CLI
        run: |
          echo "Setup Databricks environmental variables"
          echo "Host length: ${#DATABRICKS_HOST}"

          echo "Installing Databricks CLI"
          pip install databricks-cli

          echo "Updating Repo to development branch"
          databricks repos update --repo-id ${{ secrets.REPO_ID }} --branch ${{ secrets.BRANCH }}

However, I run into authorization issues. The repo I am trying to update is in a Standard Databricks workspace (not Premium). The Git provider I use is "GitHub", and the connection to the GitHub repository is set up via a GitHub access token with all relevant permissions (repo, admin, etc.). The DATABRICKS_TOKEN I refer to is a Databricks token I created for my workspace admin user. I have double-checked all secrets, so I cannot for the life of me figure out why it goes wrong.

I can run "databricks repos list" with the same credentials from the terminal without problems, but if I run "databricks repos update 12345xxx --branch development" I receive "Error: Missing Git provider credentials. Go to User Settings > Git Integration to set up your Git credentials.", even though I can pull etc. from the Databricks Repos UI with the same tokens. I get the same error when I call the Databricks API with curl.
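
For reference, the curl call mentioned above corresponds roughly to the request sketched below. This is a minimal sketch, not the exact call from the post: it assumes the Repos API endpoint `PATCH /api/2.0/repos/{repo_id}` with a `branch` field, and the Git Credentials API endpoint `GET /api/2.0/git-credentials`, which lists the Git credentials stored for the user who owns the token. The function names, the `send` helper, and the example host are hypothetical. Whether the error here comes from the token's user having no stored Git credential is an assumption the post does not confirm; the second request is only a way to check it.

```python
import json
import urllib.request


def build_repo_update_request(host, token, repo_id, branch):
    """Build (without sending) a request that checks out `branch` in a Databricks repo.

    Assumes the Repos API: PATCH /api/2.0/repos/{repo_id} with body {"branch": ...}.
    """
    url = f"{host.rstrip('/')}/api/2.0/repos/{repo_id}"
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def build_git_credentials_list_request(host, token):
    """Build (without sending) a request listing the Git credentials of the token's user.

    Assumes the Git Credentials API: GET /api/2.0/git-credentials.
    """
    url = f"{host.rstrip('/')}/api/2.0/git-credentials"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


def send(request):
    """Hypothetical helper: send a prepared request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, pass the same values the workflow uses, for example `send(build_git_credentials_list_request(host, token))` with `host` and `token` read from `DATABRICKS_HOST` and `DATABRICKS_TOKEN`. If the returned list is empty for that token, the update request would be expected to fail in the way described, but that is something to verify rather than a confirmed diagnosis.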

Can anyone help me out here?  
