Thousands of Databricks customers use Databricks Workflows every day to orchestrate business-critical workloads on the Databricks Lakehouse Platform. A great way to simplify those critical workloads is through modular orchestration.
This is now possible through our new task type, Run Job, which allows Workflows users to call a previously defined job as a task.
Why modular orchestrations?
Modular orchestrations allow for splitting a DAG up by organizational boundaries, enabling different teams in an organization to work together on different parts of a workflow. Each team owns its child jobs, including testing and updating them, which makes the parent workflows more reliable.
Modular orchestrations also offer reusability. When several workflows have common steps, it makes sense to define those steps in a job once and then reuse that job as a child job in different parent workflows. By using parameters, a reused job can be made flexible enough to fit the needs of different parent workflows. Reusing jobs reduces the maintenance burden of workflows, ensures updates and bug fixes occur in one place, and simplifies complex workflows.
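To make the reuse idea concrete, here is a minimal sketch of two parent workflows that call the same child job with different parameters. It only builds the job settings as Python dictionaries and does not call any API. The field names `run_job_task`, `job_id`, and `job_parameters` are assumed to match the Jobs API request format and should be checked against the current API reference; the job ID, job names, and parameter names are hypothetical.

```python
import json

# Hypothetical ID of a shared child job, e.g. a "data quality checks" job.
SHARED_CHECKS_JOB_ID = 1234


def run_job_task(task_key, child_job_id, job_parameters=None):
    """Build one task entry that calls a previously defined job.

    Field names are assumed to follow the Jobs API; verify before use.
    """
    task = {"task_key": task_key, "run_job_task": {"job_id": child_job_id}}
    if job_parameters:
        # Parameters let each parent adapt the shared child job to its needs.
        task["run_job_task"]["job_parameters"] = dict(job_parameters)
    return task


# Two parent workflows reuse the same child job with different parameters.
sales_parent = {
    "name": "sales_pipeline",
    "tasks": [run_job_task("quality_checks", SHARED_CHECKS_JOB_ID, {"table": "sales"})],
}
inventory_parent = {
    "name": "inventory_pipeline",
    "tasks": [run_job_task("quality_checks", SHARED_CHECKS_JOB_ID, {"table": "inventory"})],
}

print(json.dumps(sales_parent, indent=2))
```

Because both parents point at the same child job ID, a fix to the child job takes effect in both workflows without editing either parent.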
How to get started
1. When adding a task to a job, select the new task type, Run Job.
2. To search for the job to run, start typing the job name in the Job menu.
Things to consider
When using the Run Job task, do not create jobs with circular dependencies or jobs that nest more than three Run Job tasks. A circular dependency occurs when Run Job tasks directly or indirectly trigger each other. For example, Job A triggers Job B, and Job B triggers Job A.
Databricks does not support jobs that have circular dependencies or that nest more than three Run Job tasks, and future releases might not allow these jobs to run.
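Both constraints can be checked before a workflow is deployed. The sketch below is not a Databricks feature; it is an illustrative check over a hypothetical mapping from each job name to the jobs its Run Job tasks call. It also assumes that "nesting" means the number of Run Job hops on the longest chain, which is one possible reading of the limit above.

```python
def find_cycle(calls):
    """Return a list of jobs forming a circular dependency, or None.

    `calls` maps a job name to the jobs its Run Job tasks trigger.
    """
    state = {}  # job -> "visiting" while on the current path, "done" after
    path = []

    def visit(job):
        state[job] = "visiting"
        path.append(job)
        for child in calls.get(job, []):
            if state.get(child) == "visiting":
                # The child is already on the current path: a cycle.
                return path[path.index(child):] + [child]
            if child not in state:
                cycle = visit(child)
                if cycle:
                    return cycle
        path.pop()
        state[job] = "done"
        return None

    for job in calls:
        if job not in state:
            cycle = visit(job)
            if cycle:
                return cycle
    return None


def max_nesting_depth(calls):
    """Longest chain of Run Job hops. Only call this on cycle-free input."""
    memo = {}

    def depth(job):
        if job not in memo:
            memo[job] = max((1 + depth(c) for c in calls.get(job, [])), default=0)
        return memo[job]

    return max((depth(job) for job in calls), default=0)


MAX_NESTED_RUN_JOBS = 3  # limit stated in the text above

circular = {"Job A": ["Job B"], "Job B": ["Job A"]}
too_deep = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"]}

print(find_cycle(circular))         # ['Job A', 'Job B', 'Job A']
print(max_nesting_depth(too_deep))  # 4, which exceeds MAX_NESTED_RUN_JOBS
```

Running the cycle check first matters, because the depth calculation would recurse without end on a circular graph.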
Resources