Databricks Community

Stellar · ‎02-19-2024

Hi all,

I'm looking for advice on the best approach to CI/CD in Databricks, and to repo structure in general. Should we have a main branch and branch off of it, or something else? How should changes be propagated from dev to QA, and then from QA to prod? Should jobs run notebooks from Git? Should only the dev workspace be connected to Git?

Any pointers, advice, or help would be more than welcome!

AndriusVitkausk · ‎09-03-2024

We've run into merge conflicts when pulling updates into the workspace with the Repos API in our release pipeline. By the looks of it, these can only be resolved through the UI and not through the Repos API itself, so there seems to be some functionality missing there.

For example, we use "databricks repos update" with the folder path and branch preset, but when there are conflicts with the local version we end up getting "Error: Conflict pulling from remote", with no option to override this and take the incoming version. So I would suggest exploring the possibility of configuring your jobs to run off remote code that's directly in your repo, rather than the code in the workspace, to avoid this (if using jobs/workflows).
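To make the "run off remote code" suggestion concrete, here is a minimal sketch of job settings that point a notebook task at a Git repo instead of a workspace path. It only builds the payload; it does not call Databricks. The `git_source` block and `"source": "GIT"` follow the Jobs API 2.1 job settings as I understand them, but the job name, repo URL, branch, notebook path, and cluster ID are all made-up placeholders, and the provider string should be checked against the Jobs API reference for your Git host.

```python
import json


def build_git_job_settings(name, git_url, git_provider, git_branch,
                           notebook_path, cluster_id):
    """Build job settings whose notebook is fetched from Git at run time.

    All arguments are placeholders supplied by the caller; nothing here
    talks to a Databricks workspace.
    """
    return {
        "name": name,
        # The job checks out this branch on every run, so there is no
        # workspace copy that can drift and cause pull conflicts.
        "git_source": {
            "git_url": git_url,
            "git_provider": git_provider,
            "git_branch": git_branch,
        },
        "tasks": [
            {
                "task_key": "main",
                "existing_cluster_id": cluster_id,
                "notebook_task": {
                    # Path relative to the repo root, not a workspace path.
                    "notebook_path": notebook_path,
                    "source": "GIT",
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    settings = build_git_job_settings(
        name="nightly-load",
        git_url="https://dev.azure.com/example-org/example-project/_git/example-repo",
        git_provider="azureDevOpsServices",
        git_branch="main",
        notebook_path="notebooks/nightly_load",
        cluster_id="0000-000000-example",
    )
    print(json.dumps(settings, indent=2))
```

With something like this, each environment's job can point at a different branch or tag (for example a release branch for prod), so the release pipeline never has to pull into a workspace folder.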

Also, for the job definitions, check out asset bundles, which can be used to deploy workflows across workspaces with different configs per environment if wanted. But keep in mind that workflows do not directly integrate into Git anywhere, and there's no isolation from other developers making parallel changes like you have with notebooks under your private directory. So to have workflow definitions version controlled, you essentially need to make the desired changes to the dev workflow, take the JSON / YAML source, and manually save it somewhere in your repo. That can then be used from your CI/CD pipeline to run the release across environments.
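The "save the JSON somewhere in your repo" step can be scripted rather than done by hand. Below is a rough sketch, assuming you already have the job definition as a dict shaped like a Jobs API "get job" response (with a `settings` key). The `workflows/` folder, the file naming, and the helper name `save_job_definition` are my own choices for illustration, not anything Databricks prescribes.

```python
import json
import re
import tempfile
from pathlib import Path


def save_job_definition(job, repo_dir):
    """Write a job's settings into <repo_dir>/workflows/<job name>.json.

    `job` is assumed to look like a Jobs API "get job" response. Only the
    `settings` part is saved, because fields such as job_id and
    created_time differ per workspace and would add noise to diffs.
    """
    settings = job["settings"]
    # Make the job name safe to use as a file name.
    file_name = re.sub(r"[^A-Za-z0-9_.-]+", "_", settings["name"]) + ".json"
    target = Path(repo_dir) / "workflows" / file_name
    target.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys and fixed indentation keep Git diffs stable between exports.
    target.write_text(
        json.dumps(settings, indent=2, sort_keys=True) + "\n",
        encoding="utf-8",
    )
    return target


if __name__ == "__main__":
    # Hypothetical job definition, for illustration only.
    example_job = {
        "job_id": 123,
        "created_time": 1700000000000,
        "settings": {"name": "nightly load", "max_concurrent_runs": 1},
    }
    with tempfile.TemporaryDirectory() as repo_dir:
        path = save_job_definition(example_job, repo_dir)
        print(path.name)
        print(path.read_text(encoding="utf-8"))
```

The saved file can then be committed and reviewed like any other change, and the release pipeline can apply it to QA and prod with environment-specific overrides.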
