I have a scenario where my Databricks Asset Bundles require two types of job parameters:
- Common parameters that apply to all environments
- Environment-specific parameters that differ per environment
My current YAML setup is structured like this:
```yaml
# common_parameters.yaml
common_parameters: &common_parameters
  - name: job_var
    default: "common job var value"
  - name: retry_count
    default: 3   # example common param
  - name: timeout_seconds
    default: 600 # example common param
```
```yaml
# env_params.yaml
dev_parameters: &dev_parameters
  - name: redshift_host
    default: ${var.redshift_host}
  - name: test_param_dev   # new param for testing in dev
    default: "test value dev"

prod_parameters: &prod_parameters
  - name: redshift_host
    default: ${var.redshift_host}
  - name: test_param_prod  # new param for testing in prod
    default: "test value prod"
```
```yaml
# notifications.yaml
dev_notifications: &dev_notifications
  email_notifications:
    on_failure:
      - "dev-team@example.com"

prod_notifications: &prod_notifications
  email_notifications:
    on_failure:
      - "prod-alerts@example.com"
```
```yaml
# schedules.yaml
prod_schedule: &prod_schedule
  schedule:
    quartz_cron_expression: "0 0 0 * * ?"
    timezone_id: "UTC"
    pause_status: "PAUSED"
```
```yaml
# main job definition and targets
resources:
  jobs:
    dummy_job:
      name: "dummy_workflow"
      edit_mode: "EDITABLE"
      tasks:
        - task_key: ingestion
          notebook_task:
            notebook_path: ../src/ingestion/test.py
            source: "WORKSPACE"

targets:
  dev:
    resources:
      jobs:
        dummy_job:
          parameters: *dev_parameters
  prod:
    resources:
      jobs:
        dummy_job:
          parameters: *prod_parameters
```
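To be concrete about what I'm after: the `dev` target should effectively end up with the common parameters plus the dev-specific ones, i.e. a single flat list roughly like this (parameter names taken from the snippets above):

```yaml
# desired effective parameter list for the dev target
parameters:
  - name: job_var
    default: "common job var value"
  - name: retry_count
    default: 3
  - name: timeout_seconds
    default: 600
  - name: redshift_host
    default: ${var.redshift_host}
  - name: test_param_dev
    default: "test value dev"
```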
How do I include or merge the common parameters with the environment-specific parameters within each target's `parameters` list, and are there any best practices for managing this kind of parameter reuse in Databricks Asset Bundle YAML files?
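The obvious thing to write would be something like the sketch below, but as far as I understand YAML, an alias to a sequence used as a list item just nests the lists rather than flattening them (and the `<<` merge key only works for mappings, not sequences):

```yaml
# naive attempt -- as far as I can tell this produces a list of two lists,
# not one flat parameters list
targets:
  dev:
    resources:
      jobs:
        dummy_job:
          parameters:
            - *common_parameters
            - *dev_parameters
```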