There is currently no explicit, built-in mechanism in Databricks Asset Bundles (as of 2024) for directly defining global, environment-targeted constants at the bundle level that can be seamlessly accessed inside notebooks without using job or task parameters. However, there are several best-practice approaches for achieving this goal in a maintainable and scalable way.
## Options for Environment-Specific Constants
### 1. Environment Variables (Recommended)
You can define environment variables per target environment in your Asset Bundle configuration (`databricks.yml` or target-specific YAML files). These variables can then be read in your notebooks via `os.environ` in Python (or the equivalent mechanism in other languages).
How to do it:

- In your `databricks.yml` (or a target-specific YAML file), define environment variables under the `env` key:
```yaml
targets:
  dev:
    env:
      GOLD_CATALOG: gold_dev01
  uat:
    env:
      GOLD_CATALOG: gold_tst01
  prod:
    env:
      GOLD_CATALOG: gold_prod01
```
- In your Databricks notebook (Python):

```python
import os

# Set at deploy time from the selected target's env block
gold_catalog = os.environ.get("GOLD_CATALOG")
```
- This way, the correct value is injected based on the bundle target selected at deployment.
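For example, once the variable is set, the notebook can switch to the right catalog with no hardcoded environment logic. A minimal sketch, assuming the `GOLD_CATALOG` variable from the YAML above and the implicit `spark` session available in Databricks notebooks; the `sales.gold_orders` table name is hypothetical:

```python
import os

# Read the deploy-time constant; fail fast if the bundle did not set it
gold_catalog = os.environ.get("GOLD_CATALOG")
if gold_catalog is None:
    raise RuntimeError("GOLD_CATALOG is not set; was this notebook deployed via the bundle?")

# Switch to the environment-specific Unity Catalog catalog
spark.sql(f"USE CATALOG {gold_catalog}")

# "sales.gold_orders" is a hypothetical schema.table for illustration
df = spark.table("sales.gold_orders")
```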
Pros:

- Central, DRY configuration.
- Fully supported and documented pattern.
- Decouples code from environment specifics.
### 2. Configuration Files (Alternative)
Define a `.json` or `.yaml` file in your repository that maps environment names to constants (e.g., `bundle_constants.yaml`), and have each notebook read this file at runtime, selecting the value based on the current environment.
How to do it:

- Store values in a config file, e.g.:
```yaml
gold_catalog:
  dev: gold_dev01
  uat: gold_tst01
  prod: gold_prod01
```
- In your notebook, detect the environment (often via another environment variable or Databricks utilities) and look the value up in the mapping, as in the sketch below.
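A minimal sketch of that lookup, assuming the file above is saved as `bundle_constants.yaml` alongside the notebook and that an `ENV` environment variable (a name chosen here for illustration) identifies the current target:

```python
import os

import yaml  # PyYAML, preinstalled on Databricks runtimes

# The "ENV" variable name is an assumption; something must set it per environment
env = os.environ.get("ENV", "dev")

# Load the repo-level constants file and pick the value for this environment
with open("bundle_constants.yaml") as f:
    constants = yaml.safe_load(f)

gold_catalog = constants["gold_catalog"][env]
```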
Cons:

- Requires you to establish how notebooks know which environment they're running in (this can be awkward).
- More code overhead than environment variables.
### 3. Databricks Utilities (dbutils.widgets, dbutils.jobs.taskValues)
While convenient for runtime parameterization, these are job/task-oriented and run contrary to your goal of avoiding job/task parameter passing.
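For completeness, a widget-based version would look like the sketch below, using the implicit `dbutils` object available in notebooks; the widget name and default are illustrative, and the value would normally be overridden by a job/task parameter, which is exactly what you want to avoid:

```python
# Declare a text widget with a default value; a job can override it per task
dbutils.widgets.text("gold_catalog", "gold_dev01")

# Read the current value (the default, or the job-supplied parameter)
gold_catalog = dbutils.widgets.get("gold_catalog")
```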
## Official Recommendation
Databricks recommends using environment variables defined at the bundle or target level for environment-specific parameters when using Asset Bundles. This aligns with best practices for configuration management and works seamlessly across your notebooks and workflows.
## Summary Table

| Approach          | Ease of Use | Works with Notebooks | Environment-Aware | Comment                                         |
|-------------------|-------------|----------------------|-------------------|-------------------------------------------------|
| Environment Vars  | High        | Yes                  | Yes               | Officially recommended, DRY, accessible in code |
| Config Files      | Medium      | Yes                  | Needs extra step  | Useful for more complex mappings                |
| Widgets/Task Vars | Low         | Yes                  | Yes               | Job/task-level only, less DRY                   |