Hi
My company is migrating from Azure DevOps to GitHub, and we have an Azure DevOps pipeline that updates/syncs Databricks Repos whenever a pull request is made to the development branch. The Azure DevOps pipeline (which works) looks like this:
trigger:
- development
pool:
  vmImage: ubuntu-latest
steps:
- task: AzureKeyVault@2
  inputs:
    azureSubscription: 'Azure Service Connection'
    KeyVaultName: $(keyvault_name)
    SecretsFilter: '*'
    RunAsPreJob: true
- task: Bash@3
  inputs:
    targetType: 'inline'
    script: |
      echo "Setup Databricks environmental variables (to be able to autoconfig databricks-cli)"
      export DATABRICKS_HOST=$(DATABRICKSHOST)
      export DATABRICKS_TOKEN=$(DATABRICKSTOKEN)
      echo ${#DATABRICKS_HOST}
      echo Install Databricks CLI
      pip install databricks-cli
      echo Update Repo
      databricks repos update --repo-id $(repo_id) --branch $(branch)
I have rewritten it as a GitHub Actions workflow (I basically dropped the step that fetches secrets from Azure Key Vault and read them directly from GitHub secrets instead), like this:
name: Sync to Databricks Repo
on:
  push:
    branches:
      - development
  workflow_dispatch:
jobs:
  sync-to-databricks:
    runs-on: ubuntu-latest
    env:
      DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
      DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'
      - name: Install and Configure Databricks CLI
        run: |
          echo "Setup Databricks environmental variables"
          echo "Host length: ${#DATABRICKS_HOST}"
          echo "Installing Databricks CLI"
          pip install databricks-cli
          echo "Updating Repo to development branch"
          databricks repos update --repo-id ${{ secrets.REPO_ID }} --branch ${{ secrets.BRANCH }}
However, I run into authorization issues. The repo I am trying to update is in a Standard Databricks workspace (not Premium). The Git provider I use is "GitHub", and the connection to the GitHub repository is set up with a GitHub access token that has all relevant permissions (repo, admin, etc.). The DATABRICKS_TOKEN I refer to is a Databricks token I created for my workspace admin user. I have double-checked all secrets, so I can't for the life of me figure out why it goes wrong.
I can run "databricks repos list" with the same credentials from the terminal without problems, but if I run "databricks repos update 12345xxx --branch development" I receive "Error: Missing Git provider credentials. Go to User Settings > Git Integration to set up your Git credentials.", even though I can pull etc. from the Databricks Repos UI with the same tokens. I get the same error when I call the Databricks API with curl.
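For reference, the curl call is roughly equivalent to the sketch below. It assumes the Repos REST endpoint `PATCH /api/2.0/repos/{repo_id}` with a JSON body naming the branch, and that DATABRICKS_HOST and DATABRICKS_TOKEN are set in the environment. The function names are hypothetical helpers for illustration, not part of any CLI.

```shell
#!/usr/bin/env bash
# Sketch only: builds and sends the repo-update request described above.
# Assumes DATABRICKS_HOST and DATABRICKS_TOKEN are exported in the environment.
# The helper names below are hypothetical, chosen for illustration.

repo_update_url() {
  # $1 = workspace host (a trailing slash is stripped), $2 = repo id
  printf '%s/api/2.0/repos/%s' "${1%/}" "$2"
}

repo_update_payload() {
  # $1 = branch name to check out in the Databricks repo
  printf '{"branch": "%s"}' "$1"
}

update_repo() {
  # $1 = repo id, $2 = branch name
  curl --silent --show-error --request PATCH \
    --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    --header "Content-Type: application/json" \
    --data "$(repo_update_payload "$2")" \
    "$(repo_update_url "${DATABRICKS_HOST}" "$1")"
}
```

Calling `update_repo 12345xxx development` after sourcing the script sends the request; nothing is sent until the function is invoked.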
Can anyone help me out here?