Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Pass Notebook parameters dynamically in Job task.

Raj_DB
New Contributor III

Hi Everyone, 

I'm working on scheduling a job and would like to pass parameters that I've defined in my notebook. Ideally, I'd like these parameters to be dynamic, meaning that if I update their values in the notebook, the scheduled job should automatically use the latest values. Is there a way to achieve this, or any known workaround?

These are my parameters in the notebook:

[Screenshot: notebook widget parameters (Raj_DB_0-1756383510542.png)]

I use these parameters across all my notebooks.

Thanks for your help!

1 ACCEPTED SOLUTION

szymon_dybczak
Esteemed Contributor III

 

Hi @Raj_DB,

Yep, you just need to use task values. They let you pass arbitrary values between tasks in a Databricks job.

So, for instance, in your notebook you can define the values you want to pass to the next job/task in the following way:

[Screenshot: setting task values in the notebook (szymon_dybczak_0-1756393092383.png)]

Then, in the Databricks workflow, you can pass them to a downstream job/task. In the screenshot above I defined a catalog value that I can now pass as a job parameter:

{{tasks.ReadConfig.values.catalog}}

 

[Screenshot: setting the job parameter to the task value reference (szymon_dybczak_0-1756393527142.png)]

Use task values to pass information between tasks | Databricks on AWS
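For reference, in code the pattern from the screenshots looks roughly like this (a sketch only; the task name ReadConfig and the key catalog come from the example above, while the concrete values are placeholders):

# Upstream notebook (task "ReadConfig"): publish a value for downstream tasks.
dbutils.jobs.taskValues.set(key="catalog", value="dev_catalog")

# Downstream notebook: if the job/task parameter "catalog" is set to
# {{tasks.ReadConfig.values.catalog}}, read it like any other widget...
catalog = dbutils.widgets.get("catalog")

# ...or fetch the task value directly (debugValue is only used when the
# notebook runs outside of a job).
catalog = dbutils.jobs.taskValues.get(taskKey="ReadConfig", key="catalog",
                                      default="dev_catalog", debugValue="dev_catalog")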

 

 


9 REPLIES

Coffee77
New Contributor II

Why not put some extra code in the notebook to handle job input parameters and then assign the notebook default values based on a custom rule 🙂 As far as I know, there's no built-in feature to achieve your goal.
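For example, something along these lines (a rough sketch; the widget name and the fallback rule are just placeholders):

# Empty default so a value supplied by the job wins.
dbutils.widgets.text("Save_environment", "")

env = dbutils.widgets.get("Save_environment")
if not env:
    # Custom rule when the job passes nothing, e.g. derive it from the current catalog.
    env = "prod" if spark.catalog.currentCatalog() == "prod_catalog" else "preprod"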

https://www.youtube.com/@CafeConData

Raj_DB
New Contributor III

Hi @Coffee77, thank you for the response. Can you suggest how I should proceed, as I am new to this environment?

szymon_dybczak
Esteemed Contributor III

Hi @Raj_DB, yep, you just need to use task values. They let you pass arbitrary values between tasks in a Databricks job (see the accepted solution above for the full details).

Coffee77
New Contributor II

Nice 😀

https://www.youtube.com/@CafeConData

Raj_DB
New Contributor III

Thank you @szymon_dybczak, I will definitely try it. I hope it will work.

szymon_dybczak
Esteemed Contributor III

No problem @Raj_DB, it should work. I'm using this approach to dynamically pass parameters on the current project I'm part of 🙂

Raj_DB
New Contributor III

Thank you so much @szymon_dybczak, it worked perfectly! I'm also exploring the idea of maintaining a single notebook to pass parameters and reusing it across different jobs. Do you think that would be feasible with this approach, especially considering each notebook might require different parameters? I'd really appreciate any suggestions you might have.

ck7007
New Contributor II

I see you're using dbutils.widgets.text and dropdown, perfect! You're already on the right track.

Quick Solution

Your widgets are already dynamic! Just pass parameters in your job configuration:

In your notebook (slight refactor of your code):

# Define widgets with defaults
dbutils.widgets.text("Month_refresh", "3")
dbutils.widgets.dropdown("Save_environment", "preprod", ["preprod", "prod"])
dbutils.widgets.dropdown("Save_Layer", "silver", ["bronze", "silver", "gold"])
dbutils.widgets.text("Save_folder", "Test/SalesData")

# Use the widget values
month = dbutils.widgets.get("Month_refresh")
env = dbutils.widgets.get("Save_environment")
layer = dbutils.widgets.get("Save_Layer")
folder = dbutils.widgets.get("Save_folder")

In your job configuration:

  1. Go to Workflows → Create Job
  2. Add your notebook as a task
  3. Under "Parameters," add:

    {
      "Month_refresh": "6",
      "Save_environment": "prod",
      "Save_Layer": "gold",
      "Save_folder": "Prod/SalesData"
    }

These job parameters will override your notebook defaults!

Pro Tip: Environment-based Defaults

# Auto-detect environment
is_prod = spark.catalog.currentCatalog() == "prod_catalog"
default_env = "prod" if is_prod else "preprod"
default_layer = "gold" if is_prod else "silver"

dbutils.widgets.dropdown("Save_environment", default_env, ["preprod", "prod"])
dbutils.widgets.dropdown("Save_Layer", default_layer, ["bronze", "silver", "gold"])

This way, your scheduled jobs automatically adapt to the environment they run in.

Is this what you were looking for, or did you need the parameters to update without touching the job configuration?

Coffee77
New Contributor II

That would work indeed 🙂 However, the solution provided by @szymon_dybczak is really clean 🎯 In your code, if you have separate workspaces per environment, I would suggest deriving the current environment from the current workspace, or from environment variables injected into all job or all-purpose clusters, where you can store your custom environment names. You can do this via DAB (Databricks Asset Bundles) along with Databricks CLI scripts. Take into account that you can have multiple catalogs per environment, as is my use case 🙂
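A rough illustration of that idea (the environment-variable name and the URL-based rule below are assumptions, not a prescription):

import os

# Option 1: a custom variable injected into job / all-purpose clusters,
# e.g. through a Databricks Asset Bundle or cluster spec (the name is made up).
env = os.environ.get("DEPLOY_ENV")

# Option 2: fall back to deriving the environment from the workspace URL.
if env is None:
    workspace_url = spark.conf.get("spark.databricks.workspaceUrl", "")
    env = "prod" if "prod" in workspace_url else "preprod"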

https://www.youtube.com/@CafeConData
