Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to Define Constants at Bundle Level in Databricks Asset Bundles for Use in Notebooks?

akshaym0056
New Contributor
I'm working with Databricks Asset Bundles and need to define constants at the bundle level based on the target environment. These constants will be used inside Databricks notebooks.

For example, I want a constant gold_catalog to take different values depending on the target:

  • For the dev target → gold_dev01
  • For the uat target → gold_tst01
  • For the prod target → gold_prod01

Instead of setting these values using task-level parameters inside workflows or jobs, I'd prefer to define them at the bundle level, so they can be accessed globally within the asset bundle and used inside notebooks.

My Questions:

  1. Is there a way to define such constants at the bundle level in Databricks Asset Bundles?
  2. What is the best way to achieve this without defining task-level parameters?
  3. Would using environment variables, bundle config files, or another approach be recommended to access them in Databricks notebooks?

Any guidance or best practices would be appreciated! 

2 REPLIES

NandiniN
Databricks Employee

Checking.

mark_ott
Databricks Employee

Yes, you can define environment-specific constants at the bundle level in Databricks Asset Bundles and make them accessible inside Databricks notebooks without relying on task-level parameters. The main options are bundle variables with per-target overrides in the bundle configuration file, environment variables set on the job cluster, or a config file read at notebook startup. Each approach has its own trade-offs in maintainability and accessibility.

Bundle-Level Constants in Databricks Asset Bundles

1. Using the Bundle Config File (databricks.yml and the targets Section)

Databricks Asset Bundles support environment-specific configuration in the databricks.yml file. You declare custom variables under a top-level variables key and override their values per deployment target under the targets key.

Example databricks.yml:

yaml
variables:
  gold_catalog:
    description: Name of the gold catalog for the current target
    default: gold_dev01

targets:
  dev:
    variables:
      gold_catalog: gold_dev01
  uat:
    variables:
      gold_catalog: gold_tst01
  prod:
    variables:
      gold_catalog: gold_prod01

These variables are resolved at deployment time and can be referenced anywhere in the bundle configuration as ${var.gold_catalog}. They are not automatically visible inside notebooks, so you still need to hand them to the runtime, for example as environment variables on the job cluster or via notebook initialization logic.
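For example, here is a minimal sketch of injecting the variable into a job cluster as an environment variable. The job name, notebook path, and cluster settings are illustrative assumptions, not taken from this thread:

yaml
resources:
  jobs:
    gold_pipeline:          # hypothetical job name
      name: gold_pipeline
      tasks:
        - task_key: load_gold
          notebook_task:
            notebook_path: ../src/load_gold.py   # hypothetical path
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
            num_workers: 1
            spark_env_vars:
              # Resolved per target at deploy time
              gold_catalog: ${var.gold_catalog}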

2. Accessing Bundle Variables in Notebooks

Inside your notebook, you can access environment variables using standard Python or Scala methods:

python
import os

# Read the value injected via the cluster's environment variables,
# falling back to a default if it is not set
gold_catalog = os.environ.get("gold_catalog", "default_value")

To inject these environment variables, set them on the job cluster via its spark_env_vars setting, as sketched above. If your compute type does not let you set custom environment variables, fall back to notebook widgets or job parameters instead.
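If you want a single notebook-side access pattern that covers both cases, here is a minimal sketch of a fallback chain; the helper name and default value are illustrative assumptions:

python
import os

# Hypothetical helper: prefer an explicit widget/job parameter,
# then the cluster environment variable, then a safe default.
def get_constant(name, default):
    try:
        # Set when the value is passed as a task parameter
        value = dbutils.widgets.get(name)
        if value:
            return value
    except Exception:
        pass  # no widget defined in this run
    return os.environ.get(name, default)

gold_catalog = get_constant("gold_catalog", "gold_dev01")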

3. Bundle Parameter Access Patterns

If you use bundle variables in your job definitions, you can inject values into jobs or workflows defined in the bundle and then read them from notebooks via widgets or parameter passing at runtime (see the sketch below). However, since you want this at the bundle/target level and accessible globally, prefer variables defined per target combined with cluster environment variables.
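For completeness, a hedged sketch of that parameter-passing pattern; the job, task key, and notebook path are illustrative assumptions:

yaml
resources:
  jobs:
    gold_pipeline:
      tasks:
        - task_key: load_gold
          notebook_task:
            notebook_path: ../src/load_gold.py   # hypothetical path
            base_parameters:
              # Resolved per target at deploy time
              gold_catalog: ${var.gold_catalog}

Inside the notebook the value is then available via dbutils.widgets.get("gold_catalog"), though this is exactly the task-level parameter pattern you said you would rather avoid.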

4. Best Practices

  • Use targets and variables inside databricks.yml for environment-specific constants, for clarity and isolation between targets.

  • Ensure you have a consistent mechanism in your jobs and notebooks to read these variables (widgets, environment variables, or a small shared helper module).

  • Document all environment-specific variables in a dedicated section in your asset bundle source code repository for maintainability.

  • For complex scenarios, consider an initialization module that detects the target environment and loads the correct constants automatically at notebook startup, as sketched below.
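A minimal sketch of such an initialization module, assuming the target name is exposed to the cluster as an environment variable; the variable name BUNDLE_TARGET and the mapping are illustrative assumptions:

python
import os

# Hypothetical mapping from target name to constants,
# using the values from this thread.
_CONSTANTS = {
    "dev":  {"gold_catalog": "gold_dev01"},
    "uat":  {"gold_catalog": "gold_tst01"},
    "prod": {"gold_catalog": "gold_prod01"},
}

def load_constants():
    # BUNDLE_TARGET is an assumed environment variable you would set
    # on the job cluster, e.g. spark_env_vars: BUNDLE_TARGET: ${bundle.target}
    target = os.environ.get("BUNDLE_TARGET", "dev")
    return _CONSTANTS[target]

gold_catalog = load_constants()["gold_catalog"]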

Recommendations

  • Environment Variables via databricks.yml: This is the most maintainable and native approach. Define values in the variables and targets sections and inject them into the runtime context through the job cluster's spark_env_vars.

  • Config File in DBFS or Workspace Files: You can also store a config file (JSON or YAML) with constants per environment, read it at notebook startup, and set globals accordingly (see the sketch after this list).

  • No Task-Level Parameters Needed: By defining variables per target at the bundle level, you avoid job/task-level parameters, keeping your configuration cleaner and more scalable.
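For the config-file option above, a minimal sketch, assuming a JSON file deployed with the bundle to workspace files; the path and file layout are illustrative assumptions:

python
import json

# Hypothetical path to a per-target config file deployed with the bundle
CONFIG_PATH = "/Workspace/Shared/my_bundle/config/dev.json"

with open(CONFIG_PATH) as f:
    config = json.load(f)

gold_catalog = config["gold_catalog"]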



Summary:
Define environment-specific constants at the bundle level by declaring them under the variables key and overriding them per target in the targets section of your bundle configuration file (databricks.yml). Access these constants within notebooks through environment variables or by reading a config file at startup, avoiding task-level parameters for global settings. This approach gives you maintainable, reusable deployment of Databricks notebooks across multiple environments.
Define environment-specific constants at the bundle level by leveraging the environments and variables keys in your bundle configuration file (bundle.yaml). Access these constants within notebooks through environment variables or by reading from a config file, avoiding the use of task-level parameters for global settings. This approach is recommended for maintainable, reusable deployment of Databricks notebooks across multiple environments.