Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

run_if dependencies configuration within YAML

yvesbeutler
New Contributor III

Hi guys

I have a workflow with several Python wheel tasks and one job task that calls another workflow. How can I prevent my original workflow from ending in an unsuccessful state when the second workflow fails? The two workflows are independent and shouldn't affect each other's outcome; we only chain them to reduce the total run time.

[Screenshot: dbx-issue.png]

 

I tried to declare both preceding tasks (the wheel task and the job task) as depends_on in the configuration of the final task, but I can't find the correct property value for run_if to express something like "at least one success".
We use asset bundles and configure our workflows in a YAML file. Unfortunately, I only found documentation for JSON configurations and mostly UI-based approaches to solving this.

So, how do I find the correct values for configuring the run_if option?
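
For reference, a trimmed-down sketch of what I tried, with placeholder names (the real workflow has more wheel tasks, and cluster/environment settings are omitted):

resources:
  jobs:
    my_workflow:                         # placeholder name
      name: my-workflow
      tasks:
        - task_key: wheel_task
          python_wheel_task:
            package_name: my_package     # placeholder
            entry_point: main            # placeholder
        - task_key: trigger_second_workflow
          run_job_task:
            job_id: ${resources.jobs.second_workflow.id}   # placeholder reference to the other workflow
        - task_key: final_task
          depends_on:
            - task_key: wheel_task
            - task_key: trigger_second_workflow
          # run_if: ???   <- which value expresses "at least one success"?
          python_wheel_task:
            package_name: my_package     # placeholder
            entry_point: finalize        # placeholder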

1 ACCEPTED SOLUTION


eniwoke
Contributor

Hi @yvesbeutler, here is a sample of how I did it using Databricks Asset Bundles for notebook tasks:

resources:
  jobs:
    chained_jobs:
      name: chained-jobs
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: /Workspace/Users/sample/Deploying Data Assets Bundle with VSCode Add-in/pass_task_2
            source: WORKSPACE
        - task_key: failing_task
          depends_on:
            - task_key: main
          notebook_task:
            notebook_path: /Workspace/Users/sample/Deploying Data Assets Bundle with VSCode Add-in/fail_task_1
            source: WORKSPACE
        - task_key: pass_task-1
          depends_on:
            - task_key: main
          notebook_task:
            notebook_path: /Workspace/Users/sample/Deploying Data Assets Bundle with VSCode Add-in/pass_task_1
            source: WORKSPACE
        - task_key: pass_task-2
          depends_on:
            - task_key: pass_task-1
            - task_key: failing_task
          run_if: AT_LEAST_ONE_SUCCESS
          notebook_task:
            notebook_path: /Workspace/Users/sample/Deploying Data Assets Bundle with VSCode Add-in/pass_task_2
            source: WORKSPACE
      queue:
        enabled: true
      performance_target: STANDARD

It should map to your use case as well. Kindly let me know if it works for you 🙂

[Screenshot: eniwoke_0-1753110787807.png]

 

Eni


2 REPLIES

Advika
Databricks Employee

Hello @yvesbeutler!

You can set run_if: AT_LEAST_ONE_SUCCESS on the final task in your YAML configuration so that your workflow doesn't fail just because a dependent job fails. This way, the final task, and with it the overall job, only fails if all of its dependencies fail.
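
For example, the final task in your bundle YAML could look roughly like this (task keys, package name, and entry point are placeholders for your actual wheel and run-job tasks):

      tasks:
        # ... your existing Python wheel task and run-job task ...
        - task_key: final_task
          depends_on:
            - task_key: wheel_task               # placeholder: your wheel task
            - task_key: trigger_second_workflow  # placeholder: your run-job task
          run_if: AT_LEAST_ONE_SUCCESS           # runs as long as at least one dependency succeeded
          python_wheel_task:
            package_name: my_package             # placeholder
            entry_point: finalize                # placeholder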

