08-01-2025 02:47 AM
Hi all,
I am working with Databricks Asset Bundles and want to separate environment-specific job parameters (for example, for "env" and "dev") for each pipeline within my bundle. I need each pipeline to have its own job parameter values per environment, kept in separate files rather than defined inside the job YAML file itself.
08-01-2025 03:15 AM
Hey @azam-io , you can define the variables in your databricks.yml file for each target (you can define several variables per environment).
bundle:
  name: your-name
  uuid: id

include:
  - resources/jobs/*.yml
  - resources/experiments/*.yml
  - resources/dashboards/*.yml
  - resources/clusters/*.yml

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: "host-link-dev"
      root_path: "path"
    run_as:
      service_principal_name: spn
    variables:
      catalog: dev_catalog
      schema: dev_schema
  prod:
    mode: production
    workspace:
      host: "host-link-prd"
      root_path: "path"
    run_as:
      service_principal_name: spn
    variables:
      catalog: prd_catalog
      schema: prd_schema

variables:
  catalog:
    description: Catalog name.
    default: dev_catalog
  schema:
    description: Schema name.
    default: dev_schema

Then, in your pipeline or other resource YAML files, simply refer to the variables with:

${var.catalog}
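As a minimal sketch (job, task, and notebook names here are just placeholders, not from this thread), a job under resources/jobs/ could consume those variables like this:

resources:
  jobs:
    my_job:
      name: my_job
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ../src/my_notebook.py
            base_parameters:
              # resolved per target at deploy time
              catalog: ${var.catalog}
              schema: ${var.schema}

The job YAML itself stays identical across environments; only the variable values change per target.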
08-03-2025 10:17 PM
Hi Fede, thanks for your response. What I'm aiming for is to keep the variables separated by job and, within each job, by environment. For example, I envision a folder structure under the resources directory where each job has its own folder, and inside that folder there are separate files for the main job definition, the development parameters, the production parameters, etc.
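Roughly something like this (folder and file names are just illustrative):

resources/
  job_a/
    job_a.yml          # main job definition
    params.dev.yml     # development parameters
    params.prod.yml    # production parameters
  job_b/
    job_b.yml
    params.dev.yml
    params.prod.yml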
08-04-2025 12:14 AM
Hello @azam-io ,
From what I know, variables need to be defined in the databricks.yml file (I've never tried otherwise, to be fair). Since you still want your variables to be environment-dependent, I believe there are a few options.
One could be using dotenv files, or pointing at some other configuration files (maybe in volumes) where you store the parameters, and reading the file in your job.
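A minimal sketch of that idea, assuming hypothetical paths and names: the bundle only hands the job a pointer to an environment-specific config file, and the job code reads that file at runtime.

resources:
  jobs:
    my_job:
      name: my_job
      parameters:
        # the path can be switched per target via a bundle variable
        - name: config_path
          default: /Volumes/${var.catalog}/config/my_job/params.yml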
Or, keeping the structure you envision: define all the variables for all your jobs; maybe you can leverage complex variables:
variables:
  job_x_params:
    description: 'My job params'
    type: complex
    default:
      param1: 'value1'
      param2: 'value2'
      param3:
        param3.1: true
        param3.2: false

Then you can store a variable-overrides.json file for each environment. There's an example of this implementation in this other thread: Solved: Re: How to use variable-overrides.json for environ... - Databricks Community - 125126
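Another option I believe works, sketched here with made-up values: override the whole complex variable per target, so the job YAML only ever references ${var.job_x_params}.

targets:
  dev:
    variables:
      job_x_params:
        param1: 'dev_value1'
        param2: 'dev_value2'
  prod:
    variables:
      job_x_params:
        param1: 'prod_value1'
        param2: 'prod_value2'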
In my view, it can become quite hard to manage as the number of jobs and parameters increases. Storing job parameters in configuration files would probably be cleaner, and the asset bundle variables can then just hold the paths to those files.
Hope this helps; otherwise, could you maybe share an example of your parameters?
09-03-2025 07:21 AM
Hi azam-io, were you able to solve your problem?
Are you trying to have different parameters depending on the environment, or a different parameter value?
I think targets would allow you to specify different parameters per environment/target.
As for the parameter values, I have solved this problem by using variables. I have a config file which I read before running the Databricks CLI; it converts the configured values into environment variables, which I then use to set variables when executing `databricks bundle`. That unfortunately becomes a lot to do very quickly, and it's easy to miss things, but to solve that I wrapped it all in a Drone CI pipeline. Now I can run the deployment locally with a single command which, behind the scenes, runs validation, deployment, and a few extra steps on top; for example, the dev deployment automatically starts the pipeline.
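As a rough illustration of that flow (variable and target names here are just placeholders), the CLI picks up bundle variables from BUNDLE_VAR_* environment variables or from the --var flag:

# export values read from the config file as bundle variables
export BUNDLE_VAR_catalog="dev_catalog"
export BUNDLE_VAR_schema="dev_schema"
databricks bundle validate -t dev
databricks bundle deploy -t dev
# equivalently, pass a value explicitly:
# databricks bundle deploy -t dev --var="catalog=dev_catalog"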