03-08-2024 06:33 AM
Hi,
I am migrating a deployment setup from dbx to Databricks Asset Bundles (DAB). The setup has specific parameters per environment. This worked well with dbx, and I am now trying to define those parameters through targets (3 targets: dev, uat, prod). Each of these targets should pull the code from a different git branch. However, adding this:
git:
  branch: dev | uat | prod
doesn't change anything: no matter which target I deploy, the default branch specified in the main resource setup gets pulled:
git_source:
  git_url: *****
  git_provider: *****
  git_branch: main
Does anyone have a solution to pull from a different git branch based on which target is deployed via databricks bundle deploy -t target?
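For readers following along, the two fragments above can be put together into one sketch of the layout being described. This is only a reading of the post, not the actual file: the job name my_job is a hypothetical placeholder, and the masked values stay masked (quoted here so the YAML parses).

```yaml
# Sketch of the attempted layout (job name is hypothetical)
resources:
  jobs:
    my_job:
      git_source:
        git_url: "*****"
        git_provider: "*****"
        git_branch: main   # this branch gets pulled for every target

targets:
  dev:
    git:
      branch: dev    # per-target branch that has no effect on the job
  uat:
    git:
      branch: uat
  prod:
    git:
      branch: prod
```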
03-13-2024 04:19 AM
Hi @thibault, when working with Databricks Asset Bundles (DABs), you can indeed specify different git branches for different targets.
Let’s address this step by step:
Understanding Databricks Asset Bundles (DABs):
Defining Git Branches for Different Targets:
Bundle Configuration YAML:
The bundle configuration YAML allows you to define settings for your DAB.
Here’s an example of how you can set up git branches per target:
bundle:
  name: MyDAB
  # Other bundle settings...

targets:
  dev:
    git:
      git_branch: dev
      git_provider: <provider>
      git_url: <git_url>
  uat:
    git:
      git_branch: uat
      git_provider: <provider>
      git_url: <git_url>
  prod:
    git:
      git_branch: prod
      git_provider: <provider>
      git_url: <git_url>
Replace <provider> and <git_url> with your actual Git provider and repository URL.
Explanation:
Each target (dev, uat, prod) specifies its own git branch.
When you deploy to a target (e.g., databricks bundle deploy -t dev), Databricks will use the corresponding git branch defined for that target.
The git_source section in your main resource setup will be overridden by the target-specific configuration.
Deploying with Different Git Branches:
When you run databricks bundle deploy -t target, Databricks will pull the code from the specified git branch associated with that target.
Remember to replace the placeholders (<provider> and <git_url>) with your actual Git provider details and repository URL. With this setup, you’ll be able to manage different git branches per environment using DABs.
03-13-2024 06:09 AM
Kaniz, thank you for your response.
I have done exactly what you suggest (both with naming the section git and git_source) and I am getting this error:
Error: terraform apply: exit status 1
Error: cannot update job: Invalid use of git source in Python Task specification
The workflow consists of 3 tasks: 2 Python tasks and 1 notebook task, all pulled from git.
03-13-2024 10:59 AM - edited 03-13-2024 11:06 AM
To add some details here, if I specify the git config under the overall resources: -> jobs, the deployment succeeds, but the git setups per branch do not override that global setup.
Also, if there is no git setup at the top level resources: -> jobs, despite it being defined at the target level, the bundle.tf.json produced with databricks bundle deploy does not contain any info about git sources, which is why the deployment fails.
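One arrangement that fits the behavior described above keeps git_source at the job level, where the deployment succeeds, and varies only the branch through a bundle variable that each target overrides. The following is a minimal sketch under that assumption, not a confirmed fix for this thread and not tested against the CLI version discussed here. The bundle name, job name, task key, and file path are hypothetical, and <git_url> and <provider> remain placeholders.

```yaml
# databricks.yml (sketch; all names and paths are hypothetical)
bundle:
  name: my_bundle

variables:
  git_branch:
    description: Branch the job pulls its sources from
    default: main

resources:
  jobs:
    my_job:
      name: my_job
      git_source:
        git_url: <git_url>
        git_provider: <provider>
        git_branch: ${var.git_branch}   # resolved per target at deploy time
      tasks:
        - task_key: python_task
          spark_python_task:
            python_file: src/main.py    # path relative to the repository root
            source: GIT
        # compute settings and the remaining tasks are omitted for brevity

targets:
  dev:
    variables:
      git_branch: dev
  uat:
    variables:
      git_branch: uat
  prod:
    variables:
      git_branch: prod
```

With a layout like this, databricks bundle deploy -t uat would be expected to substitute uat into git_branch, while the git_source block itself stays in the one place where the job definition accepts it.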
03-21-2024 12:26 AM
All good now?
03-21-2024 02:14 AM
Nope, I had to revert back to using dbx, so I will postpone migration to DAB until I find examples that work.
03-28-2024 03:17 AM
Something must have changed on the Databricks side in the meantime. I have only updated the Databricks CLI to 016, and now, when I deploy with a git / branch setting under each target (feature-dab being the branch I want the job to pull sources from), I see this:
I'm not sure what this means, though, as I've never seen this in the Jobs UI, and the YAML produced still shows only the main branch as the source. Do you know, @Kaniz? I can't find docs on this.