02-26-2024 11:36 AM
Hello,
My team and I are experimenting with bundles. We follow the pattern of having one main file, Databricks.yml, with each job definition specified in a separate YAML file for modularization.
We wonder whether it is possible to select, from the main Databricks.yml, which job resources are deployed per target. Specifically, we have a job called test_<name of the application>, which contains all the unit and integration tests. Ideally, this test job would be deployed only in development, alongside the rest of the resources, while in production it would be excluded.
Below is an example of the Databricks.yml.
# yaml-language-server: $schema=..\..\bundle_config_schema.json
bundle:
  name: app_name
include:
  - resources/*.yml
  - tests/test_job.yml
targets:
  dev:
    # We know this is not possible but ideally something like this would be brilliant
    # include:
    #   - resources/*.yml
    #   - tests/test_job.yml
    default: true
    variables:
      slack_web_hoook: 111111111222222222222222
      catalog: catalog_name
      storage_account_name: storege_account_name
    mode: development
    workspace:
      host: https://adb-******.azuredatabricks.net
  prod:
    # We know this is not possible but something like this would be brilliant
    # exclude:
    #   - "tests/*"
    variables:
      slack_web_hoook: 1111112222333344444
      catalog: _catalog_name
      storage_account_name: storage_account_name
    mode: production
    workspace:
      host: https://adb-***************.azuredatabricks.net
      root_path: /Shared/.bundle/prod/${bundle.name}
    run_as:
      user_name: user.user@email
Is there any alternative better than defining the whole job resource within this file?
03-11-2025 05:38 AM
Not that I'm aware of. I worked around it by having the CI/CD configuration substitute the resources folder string in databricks.yml. Definitely not great.
03-11-2025 07:08 AM
Would you be able to share how to use the variable in the include? I can't seem to work it out.
03-12-2025 02:50 AM
Again, this is not what I would recommend, and it's temporary, but this is what it looks like in databricks.yml:
include:
  - ./$asset_folder/*.yml
And in the training section of my CI/CD pipeline:
- script: |
    sed -i 's/\$asset_folder/resources_training/g' databricks.yml
  workingDirectory: $(workingDirectory)
  displayName: Define assets to be included in the bundle
05-13-2025 10:01 AM
Is there any update from Databricks about this?
Tuesday
Experiencing the same issue. I solved it partially by placing high-level targets in the job YAML file, but this only works if the job has to go to only one environment. If the job is needed in two environments but not in a third, there is no way to avoid duplicating it. This is really inconvenient. Targets should have high-level includes.
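For anyone trying the partial workaround described above, here is a minimal sketch of what the job file might look like, assuming the test job lives in tests/test_job.yml and should exist only in the dev target. The job key, job name, task key, and notebook path are hypothetical placeholders, not taken from this thread.

```yaml
# tests/test_job.yml (included from the main databricks.yml)
# The job is declared under targets.dev.resources instead of the
# top-level resources key, so it is only part of the bundle when
# deploying the dev target.
targets:
  dev:
    resources:
      jobs:
        test_app_name:                 # hypothetical job key
          name: test_app_name          # hypothetical job name
          tasks:
            - task_key: run_tests      # hypothetical task key
              notebook_task:
                notebook_path: ./run_tests.py   # hypothetical path
```

With this layout, a deployment to prod never sees the job, since prod has no matching resources entry. As noted, making the job available in a second target would mean repeating the definition under that target as well.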