03-08-2024 06:33 AM
Hi,
I am migrating a deployment setup from dbx to Databricks Asset Bundles (DAB), where I have specific parameters per environment. This was working well with dbx, and I am now trying to define those parameters through targets (3 targets: dev, uat, prod). Each of these targets should pull the code from a different git branch. However, adding this to each target:
git:
  branch: dev | uat | prod
doesn't change anything: no matter which target I deploy, the default branch specified in the main resource setup gets pulled:
git_source:
  git_url: *****
  git_provider: *****
  git_branch: main
Does anyone have a solution to pull from a different git branch based on which target is deployed via databricks bundle deploy -t target?
03-13-2024 04:19 AM
Hi @thibault, When working with Databricks Asset Bundles (DABs), you can indeed specify different git branches for different targets.
Let's address this step by step:
Understanding Databricks Asset Bundles (DABs):
Defining Git Branches for Different Targets:
Bundle Configuration YAML:
The bundle configuration YAML allows you to define settings for your DAB.
Here's an example of how you can set up git branches per target:
bundle:
  name: MyDAB
  # Other bundle settings...
targets:
  dev:
    git:
      git_branch: dev
      git_provider: <provider>
      git_url: <git_url>
  uat:
    git:
      git_branch: uat
      git_provider: <provider>
      git_url: <git_url>
  prod:
    git:
      git_branch: prod
      git_provider: <provider>
      git_url: <git_url>
Replace <provider> and <git_url> with your actual Git provider and repository URL.
Explanation:
Each target (dev, uat, prod) specifies its own git branch.
When you deploy to a target (for example, databricks bundle deploy -t dev), Databricks will use the corresponding git branch defined for that target.
The git_source section in your main resource setup will be overridden by the target-specific configuration.
Deploying with Different Git Branches:
When you run databricks bundle deploy -t target, Databricks will pull the code from the specified git branch associated with that target.
Remember to replace the placeholders (<provider> and <git_url>) with your actual Git provider details and repository URL. With this setup, you'll be able to manage different git branches per environment using DABs.
3 weeks ago
Hello Kaniz_Fatma,
I tried your suggestions. However, the tags: git_branch, git_url, and git_provider are not recognized in the bundle.yml (https://docs.databricks.com/en/dev-tools/bundles/settings.html#git)
They are valid in the resource.yml context, but they do not work as Thibault mentioned.
My bundle.yml:
bundle:
  name: alerts
include:
  - resources/*.yml
targets:
  prod:
    git:
      origin_url: *****
      # provider: AWS CodeCommit
      branch: master
    mode: production
    workspace:
      host: ******
Putting the following tags inside the resource.yml had no effect:
git_source:
  git_url: *********
  git_provider: AWS CodeCommit
  git_branch: master
I saw that a way to deploy a resource with a Git provider as its source was released (https://community.databricks.com/t5/data-engineering/databricks-asset-bundle-dab-from-a-git-repo/td-...). However, the video and files didn't clearly show how it was done. I would appreciate it if you could clarify this part.
Best,
03-13-2024 06:09 AM
Kaniz, thank you for your response.
I have done exactly what you suggest (both with naming the section git and git_source) and I am getting this error:
Error: terraform apply: exit status 1
Error: cannot update job: Invalid use of git source in Python Task specification
The workflow consists of 3 tasks: 2 Python tasks and 1 notebook task, all pulled from git.
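For reference, a job of this shape (Python and notebook tasks pulled from git) is normally declared with a job-level git_source plus source: GIT on each task. The fragment below is only a minimal sketch, not the actual config from this thread: the job key my_job, the file paths, the repository URL, the provider value and the git_branch variable are all hypothetical, and compute settings are left out. It assumes bundle variables with per-target values are used to switch the branch.

```yaml
# Sketch only: names, paths and URL are placeholders, not taken from the thread.
variables:
  git_branch:
    description: Branch the job pulls its code from
    default: main

resources:
  jobs:
    my_job:                       # hypothetical job key
      name: my_job
      git_source:                 # job-level git settings shared by all tasks
        git_url: https://github.com/example-org/example-repo
        git_provider: gitHub
        git_branch: ${var.git_branch}
      tasks:
        - task_key: python_task
          spark_python_task:
            python_file: src/main.py        # path relative to the repo root
            source: GIT
          # cluster / compute settings omitted in this sketch
        - task_key: notebook_task
          notebook_task:
            notebook_path: notebooks/report # path relative to the repo root
            source: GIT

targets:
  dev:
    variables:
      git_branch: dev
  uat:
    variables:
      git_branch: uat
  prod:
    variables:
      git_branch: prod
```

With a layout like this, databricks bundle deploy -t dev would be expected to resolve ${var.git_branch} to dev, but this has not been checked against the CLI version used in this thread.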
03-13-2024 10:59 AM - edited 03-13-2024 11:06 AM
To add some details here, if I specify the git config under the overall resources: -> jobs, the deployment succeeds, but the git setups per branch do not override that global setup.
Also, if there is no git setup at the top level resources: -> jobs, despite it being defined at the target level, the bundle.tf.json produced with databricks bundle deploy does not contain any info about git sources, which is why the deployment fails.
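One approach that may be worth trying, given that only the git_source under resources: -> jobs ends up in the generated job, is to override that same job field inside each target's resources mapping rather than using a target-level git section. The fragment below is a sketch under assumptions: the job key my_job, the repository URL and the provider value are placeholders, and it relies on target-level resources being merged with the top-level job definition.

```yaml
# Sketch only: my_job, the URL and the provider value are placeholders.
resources:
  jobs:
    my_job:
      name: my_job
      git_source:
        git_url: https://github.com/example-org/example-repo
        git_provider: gitHub
        git_branch: main          # default branch, used when a target does not override it

targets:
  dev:
    resources:
      jobs:
        my_job:                   # same job key as above, so the settings are merged
          git_source:
            git_branch: dev
  uat:
    resources:
      jobs:
        my_job:
          git_source:
            git_branch: uat
  prod:
    resources:
      jobs:
        my_job:
          git_source:
            git_branch: prod
```

Running databricks bundle validate -t dev and inspecting the resolved job settings should show whether the override was picked up before deploying.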
03-21-2024 12:26 AM
All good now?
03-21-2024 02:14 AM
Nope, I had to revert back to using dbx, so I will postpone migration to DAB until I find examples that work.
03-28-2024 03:17 AM
Something must have changed in the meantime on the Databricks side. I have only updated the Databricks CLI to 016, and now a git / branch section under each target deploys. With feature-dab as the branch I want the job to pull sources from, I see this:
But not sure what this means as I've never seen this in the Jobs UI, and the YAML produced still shows only the main branch as source. Do you know @Kaniz_Fatma ? I can't find docs on this.