Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How do I efficiently manage common and environment-specific job parameters in DABs

azam-io
New Contributor II

I have a scenario where my Databricks asset bundles require two types of job parameters:

  • Common parameters that apply to all environments.
  • Environment-specific parameters that differ per environment.

My current YAML setup is structured like this:

# common_parameters.yaml
common_parameters: &common_parameters
  - name: job_var
    default: "common job var value"
  - name: retry_count
    default: 3 # example common param
  - name: timeout_seconds
    default: 600 # example common param

# env_params.yaml
dev_parameters: &dev_parameters
  - name: redshift_host
    default: ${var.redshift_host}
  - name: test_param_dev # new param for testing in dev
    default: "test value dev"

prod_parameters: &prod_parameters
  - name: redshift_host
    default: ${var.redshift_host}
  - name: test_param_prod # new param for testing in prod
    default: "test value prod"

# notifications.yaml
dev_notifications: &dev_notifications
  email_notifications:
    on_failure:
      - "dev-team@example.com"

prod_notifications: &prod_notifications
  email_notifications:
    on_failure:
      - "prod-alerts@example.com"

# schedules.yaml
prod_schedule: &prod_schedule
  schedule:
    quartz_cron_expression: "0 0 0 * * ?"
    timezone_id: "UTC"
    pause_status: "PAUSED"

# main job definition and targets
resources:
  jobs:
    dummy_job:
      name: "dummy_workflow"
      edit_mode: "EDITABLE"
      tasks:
        - task_key: ingestion
          notebook_task:
            notebook_path: ../src/ingestion/test.py
            source: "WORKSPACE"

targets:
  dev:
    resources:
      jobs:
        dummy_job:
          parameters: *dev_parameters

  prod:
    resources:
      jobs:
        dummy_job:
          parameters: *prod_parameters

How do I include or merge the common parameters with the environment-specific parameters within the `parameters` list? And are there any best practices for managing this kind of parameter reuse in Databricks Asset Bundle YAML files?
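To make the intent concrete: a YAML alias like `*common_parameters` can only substitute a whole sequence, not splice two sequences together, so the merge I'm after would behave like this plain-Python sketch (values are placeholders from the example above, not the real defaults):

```python
# Hedged sketch of the desired merge: concatenate the common and
# environment-specific parameter lists, with an environment entry
# overriding a common entry of the same name.
common_parameters = [
    {"name": "job_var", "default": "common job var value"},
    {"name": "retry_count", "default": 3},
    {"name": "timeout_seconds", "default": 600},
]

dev_parameters = [
    {"name": "redshift_host", "default": "dev-host"},   # placeholder value
    {"name": "test_param_dev", "default": "test value dev"},
]

def merge_parameters(common, env):
    """Merge two parameter lists; env entries win on name collisions."""
    by_name = {p["name"]: dict(p) for p in common}
    for p in env:
        by_name[p["name"]] = dict(p)
    # dicts preserve insertion order, so common params come first
    return list(by_name.values())

merged = merge_parameters(common_parameters, dev_parameters)
print([p["name"] for p in merged])
# → ['job_var', 'retry_count', 'timeout_seconds', 'redshift_host', 'test_param_dev']
```

This is just an illustration of the semantics I want the bundle YAML to express, not part of any deployment tooling.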
