Passing Parameters Between Nested 'Run Job' Tasks in Databricks Workflows

Kaniz
Community Manager

Posting this on behalf of zaheer.abbas.

I'm dealing with a scenario similar to the one mentioned here, where I have jobs composed of tasks that need to pass parameters to each other, but all my tasks are configured as "Run Job" tasks rather than directly running notebooks.

Here's a breakdown of my setup:

  1. Job_A: This job acts as the master orchestrator for several other jobs.

    • Task_A (Type: Run Job): This task runs a job whose tasks are notebooks, within which certain task parameters are set.
      • Task_A_1_Notebook: It sets a task parameter named task_parameter_1 with the value 'Test Run'.
      • Task_A_2_Notebook: It sets another task parameter named task_parameter_2 with the value 'Test Run 2'.
  2. Job_B: This is another job consisting of "Run Job" tasks that contain notebooks, and I want to use the parameters set by Job_A's tasks.

    • Task_B (Type: Run Job): This task needs to access the task values set in Job_A's notebooks.
      • Task_B_1_Notebook: It needs to retrieve the value of task_parameter_1 set by Task_A_1_Notebook.

I've attempted to reference the parameters in Task_B using the syntax {{tasks.Task_A.values.task_parameter_1}}, but I can't seem to directly reference notebooks within this structure, only the job task names.

How can I correctly pass and access these task parameters between different "Run Job" tasks, especially when the actual parameter setting occurs in notebooks within these jobs?
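
To make the setup concrete, here is a simplified sketch of the intent (I'm assuming the values are published from the notebooks with dbutils.jobs.taskValues; the names and values are the ones listed above):

    # Task_A_1_Notebook -- runs inside the job triggered by Task_A (Run Job)
    dbutils.jobs.taskValues.set(key='task_parameter_1', value='Test Run')

    # Task_A_2_Notebook -- runs inside the same child job
    dbutils.jobs.taskValues.set(key='task_parameter_2', value='Test Run 2')

    # On the Job_B side, Task_B's configuration tries to reference the value with
    #   {{tasks.Task_A.values.task_parameter_1}}
    # but only the job task names, not the notebooks inside them, seem to be addressable.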


2 REPLIES

Kaniz
Community Manager

Hi, in Databricks you can use task values to pass arbitrary parameters between tasks in a job. This allows you to share information across different tasks, even when those tasks are configured as "Run Job" tasks containing notebooks.

Here's how you can correctly pass and access task parameters between different "Run Job" tasks:

  1. Setting Task Values in Notebooks (Job_A):

    • In your Task_A_1_Notebook and Task_A_2_Notebook, use the following commands to set task values:
      # In Task_A_1_Notebook and Task_A_2_Notebook respectively: publish values for downstream tasks
      dbutils.jobs.taskValues.set(key='task_parameter_1', value='Test Run')
      dbutils.jobs.taskValues.set(key='task_parameter_2', value='Test Run 2')
      
      • Replace 'Test Run' and 'Test Run 2' with the actual parameter values you want to pass.
  2. Retrieving Task Values in Notebooks (Job_B):

    • In your Task_B_1_Notebook (within Job_B), retrieve the values set by Task_A_1_Notebook using the following commands:
      # Read the value published by the upstream task
      task_parameter_1_value = dbutils.jobs.taskValues.get(taskKey='Task_A', key='task_parameter_1', default=None)
      
      • Replace 'Task_A' with the name of the job task (Task_A) that set the value.
      • The default=None argument specifies a default value to return if the key is not found.
  3. Using Dynamic Value References:

    • You can now use dynamic value references directly in your notebooks to reference task values set in upstream tasks. For example:
      {{tasks.Task_A.values.task_parameter_1}}
      
      • This allows you to reference the value of task_parameter_1 set by Task_A_1_Notebook.

By following these steps, you'll be able to pass and access task parameters between different "Run Job" tasks, even when the parameter setting occurs within notebooks. Remember to replace the placeholders with your actual task names and parameter values. Happy orchestrating! 🚀📊
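
Putting steps 1-3 together, here is a minimal end-to-end sketch, assuming the task and parameter names above; the notebook parameter name upstream_param is a hypothetical illustration of where a dynamic value reference could be placed, not part of the original setup:

    # Task_A_1_Notebook (upstream): publish the value as a task value.
    dbutils.jobs.taskValues.set(key='task_parameter_1', value='Test Run')

    # In the downstream task's configuration, a notebook parameter can carry the
    # dynamic value reference, e.g.:
    #   upstream_param = {{tasks.Task_A.values.task_parameter_1}}
    # (the parameter name 'upstream_param' is hypothetical)

    # Task_B_1_Notebook (downstream): read the value via the task values API...
    value_from_task_values = dbutils.jobs.taskValues.get(taskKey='Task_A', key='task_parameter_1', default=None)

    # ...or via the notebook parameter populated by the dynamic value reference.
    value_from_parameter = dbutils.widgets.get('upstream_param')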

For more details, refer to the official Databricks documentation.

 

zaheerabbas
New Contributor II

Thanks, @Kaniz, I have tried the above approach by setting values in the notebooks within the `Run Job` type tasks. But when retrieving them, the notebook runs into errors saying the task name is not defined in the workflow.

The above approach of setting task values and using dynamic value references does not work when all tasks are of type `Run Job`; it runs into errors. The discussion I linked in my original query describes the same problem, but in my case, instead of the starting task being a notebook, all of my tasks are of the `Run Job` type.
