<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Robust/complex scheduling with dependency within Databricks? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/144952#M52423</link>
    <description>&lt;P&gt;Robust scheduling with dependencies within Databricks?&lt;/P&gt;&lt;P&gt;Thanks for reviewing my thread. I would like to explore robust/complex scheduling with dependencies within Databricks.&lt;/P&gt;&lt;P&gt;Traditional scheduling frameworks allow robust dependency and condition settings across multiple tiers. How can we do that within Databricks scheduling?&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;We have an HR application in tier 1: 100 jobs, start time 12 AM.&lt;/P&gt;&lt;P&gt;We have a Finance application in tier 2: 125 jobs, start time 10 AM plus completion of the HR application's 100 jobs.&lt;/P&gt;&lt;P&gt;These can run daily or weekly. How do we do this? Are there any docs or whitepapers on this subject?&lt;/P&gt;&lt;P&gt;Thanks for your insights.&lt;/P&gt;</description>
    <pubDate>Thu, 22 Jan 2026 23:34:33 GMT</pubDate>
    <dc:creator>RIDBX</dc:creator>
    <dc:date>2026-01-22T23:34:33Z</dc:date>
    <item>
      <title>Robust/complex scheduling with dependency within Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/144952#M52423</link>
      <description>&lt;P&gt;Robust scheduling with dependencies within Databricks?&lt;/P&gt;&lt;P&gt;Thanks for reviewing my thread. I would like to explore robust/complex scheduling with dependencies within Databricks.&lt;/P&gt;&lt;P&gt;Traditional scheduling frameworks allow robust dependency and condition settings across multiple tiers. How can we do that within Databricks scheduling?&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;We have an HR application in tier 1: 100 jobs, start time 12 AM.&lt;/P&gt;&lt;P&gt;We have a Finance application in tier 2: 125 jobs, start time 10 AM plus completion of the HR application's 100 jobs.&lt;/P&gt;&lt;P&gt;These can run daily or weekly. How do we do this? Are there any docs or whitepapers on this subject?&lt;/P&gt;&lt;P&gt;Thanks for your insights.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2026 23:34:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/144952#M52423</guid>
      <dc:creator>RIDBX</dc:creator>
      <dc:date>2026-01-22T23:34:33Z</dc:date>
    </item>
    <item>
      <title>Re: Robust/complex scheduling with dependency within Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/144959#M52426</link>
      <description>&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Interesting scenario. Here is what I think you can do:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;Tier 1 job: a single job that contains 100 tasks (each task triggers a job), scheduled to run at 12:00 AM.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Tier 2 job: a single job that contains 125 tasks, scheduled to run at 10:00 AM.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;All 125 tasks in Tier 2 depend on a single SQL alert task that runs a SQL query against the system tables to determine whether the Tier 1 job has completed. As soon as the Tier 1 job completes, the SQL alert task's dependency is fulfilled and the Tier 2 job is ready to run at 10 AM as scheduled.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;You don't need to define 100 or 125 individual tasks, one per job. You can simplify the design by using a For Each loop that reads per-job parameters from a JSON list, combined with If/Else conditions.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 23 Jan 2026 02:50:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/144959#M52426</guid>
      <dc:creator>pradeep_singh</dc:creator>
      <dc:date>2026-01-23T02:50:27Z</dc:date>
    </item>
    <item>
      <title>Re: Robust/complex scheduling with dependency within Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/144960#M52427</link>
      <description>&lt;P&gt;Further reading:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;SQL Alert task -&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/sql" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/sql&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;If/Else task -&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/if-else" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/if-else&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;For Each task -&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/for-each" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/for-each&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;Run Job task -&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/run-job" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/run-job&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;Configure task dependencies -&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/jobs/run-if" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/run-if&lt;/A&gt;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Fri, 23 Jan 2026 02:54:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/144960#M52427</guid>
      <dc:creator>pradeep_singh</dc:creator>
      <dc:date>2026-01-23T02:54:19Z</dc:date>
    </item>
    <item>
      <title>Hi @RIDBX, Databricks Lakeflow Jobs has several features...</title>
      <link>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/150292#M53337</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/103045"&gt;@RIDBX&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Databricks Lakeflow Jobs has several features that let you build exactly this kind of tiered, dependency-driven orchestration natively. Here is how I would approach your HR (Tier 1) and Finance (Tier 2) scenario.&lt;/P&gt;
&lt;P&gt;OPTION 1: SINGLE ORCHESTRATOR JOB WITH RUN JOB TASKS&lt;/P&gt;
&lt;P&gt;The cleanest approach is to create one top-level orchestrator job that coordinates everything. Databricks supports a "Run Job" task type that lets a task inside one job trigger and wait for another job to complete before downstream tasks proceed.&lt;/P&gt;
&lt;P&gt;Your design would look like this:&lt;/P&gt;
&lt;PRE&gt;Orchestrator Job (scheduled daily at 12 AM)
|
|-- [HR_Job_1]  (Run Job task -&amp;gt; triggers HR Job 1)
|-- [HR_Job_2]  (Run Job task -&amp;gt; triggers HR Job 2)
|-- ...
|-- [HR_Job_100] (Run Job task -&amp;gt; triggers HR Job 100)
|
|-- (all 100 HR tasks must succeed)
|
|-- [Finance_Job_1]   (Run Job task -&amp;gt; triggers Finance Job 1)
|-- [Finance_Job_2]   (Run Job task -&amp;gt; triggers Finance Job 2)
|-- ...
|-- [Finance_Job_125] (Run Job task -&amp;gt; triggers Finance Job 125)&lt;/PRE&gt;
&lt;P&gt;Each of the 100 HR tasks is a "Run Job" task that triggers its respective standalone HR job. The 125 Finance tasks are also "Run Job" tasks, and each one is configured to depend on ALL 100 HR tasks completing successfully. This is done through the task dependency graph (DAG) in the Jobs UI.&lt;/P&gt;
&lt;P&gt;To set up a Run Job task:&lt;BR /&gt;
1. In your orchestrator job, click "Add task"&lt;BR /&gt;
2. Set the Type to "Run Job"&lt;BR /&gt;
3. Select the target job from the dropdown&lt;BR /&gt;
4. Set the dependencies to the upstream tasks that must complete first&lt;/P&gt;
&lt;P&gt;A single Databricks job supports up to 1,000 tasks, so your 225 total tasks (100 + 125) fit well within that limit.&lt;/P&gt;
&lt;P&gt;Documentation: &lt;A href="https://docs.databricks.com/aws/en/jobs/run-job" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/run-job&lt;/A&gt;&lt;/P&gt;
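&lt;P&gt;As a rough sketch of what a fragment of the orchestrator's Jobs API 2.1 payload could look like (the job IDs and task keys below are placeholders, and only a few of the 225 tasks are shown), each Run Job task references its target job and the Finance tasks list the HR tasks in depends_on:&lt;/P&gt;
&lt;PRE&gt;{
  "name": "orchestrator",
  "tasks": [
    {"task_key": "hr_job_1", "run_job_task": {"job_id": 111}},
    {"task_key": "hr_job_2", "run_job_task": {"job_id": 112}},
    {
      "task_key": "finance_job_1",
      "run_job_task": {"job_id": 211},
      "depends_on": [
        {"task_key": "hr_job_1"},
        {"task_key": "hr_job_2"}
      ]
    }
  ]
}&lt;/PRE&gt;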
&lt;P&gt;OPTION 2: USE FOR EACH TASKS TO SIMPLIFY&lt;/P&gt;
&lt;P&gt;If your HR and Finance jobs follow similar patterns and can be parameterized, you can use the "For Each" task to dramatically simplify the orchestrator. Instead of defining 100 individual Run Job tasks for HR, you define one For Each task that iterates over a list of job configurations.&lt;/P&gt;
&lt;PRE&gt;Orchestrator Job (scheduled daily at 12 AM)
|
|-- [HR_ForEach] (For Each task, iterates over 100 HR job configs)
|       |-- nested task: Run Job (parameterized)
|
|-- (HR_ForEach must succeed)
|
|-- [Finance_ForEach] (For Each task, iterates over 125 Finance job configs)
        |-- nested task: Run Job (parameterized)&lt;/PRE&gt;
&lt;P&gt;You can pass in a JSON array of parameters (job IDs, config values, etc.) and set a concurrency level to control how many run in parallel. The Finance For Each task depends on the HR For Each task, so it only starts after all HR iterations complete.&lt;/P&gt;
&lt;P&gt;Documentation: &lt;A href="https://docs.databricks.com/aws/en/jobs/for-each" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/for-each&lt;/A&gt;&lt;/P&gt;
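&lt;P&gt;A minimal sketch of the HR For Each task (the job IDs are placeholders, and the {{input.job_id}} reference assumes the documented dynamic value syntax for For Each iterations):&lt;/P&gt;
&lt;PRE&gt;{
  "task_key": "HR_ForEach",
  "for_each_task": {
    "inputs": "[{\"job_id\": 111}, {\"job_id\": 112}, {\"job_id\": 113}]",
    "concurrency": 10,
    "task": {
      "task_key": "run_hr_job",
      "run_job_task": {"job_id": "{{input.job_id}}"}
    }
  }
}&lt;/PRE&gt;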
&lt;P&gt;OPTION 3: TWO SEPARATE JOBS WITH TABLE-UPDATE OR CONTINUOUS TRIGGERS&lt;/P&gt;
&lt;P&gt;If you prefer to keep Tier 1 and Tier 2 as separate jobs, you can use trigger-based coordination:&lt;/P&gt;
&lt;P&gt;1. Schedule the HR job (Tier 1) at 12 AM.&lt;BR /&gt;
2. Configure the Finance job (Tier 2) with a table-update trigger or use a webhook/API-based approach:&lt;/P&gt;
&lt;PRE&gt; - Have the last task in the HR job write a "completion marker" to a Delta table.
 - Configure the Finance job with a table-update trigger that monitors that marker table.
 - The Finance job will automatically start when the marker table is updated.&lt;/PRE&gt;
&lt;P&gt;Alternatively, the last task in the HR job can call the Databricks Jobs API (POST /api/2.1/jobs/run-now) to programmatically trigger the Finance job.&lt;/P&gt;
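&lt;P&gt;For example (a sketch only; the host and token environment variables and the job ID are placeholders), the final HR task could trigger the Finance job with:&lt;/P&gt;
&lt;PRE&gt;curl -X POST "$DATABRICKS_HOST/api/2.1/jobs/run-now" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"job_id": 211}'&lt;/PRE&gt;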
&lt;P&gt;Documentation: &lt;A href="https://docs.databricks.com/aws/en/jobs/triggers" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/triggers&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;CONDITIONAL EXECUTION WITH RUN IF&lt;/P&gt;
&lt;P&gt;For more granular control, each downstream task can use "Run if dependencies" to specify conditions such as:&lt;BR /&gt;
- "All succeeded" (default): run only if every upstream dependency succeeded&lt;BR /&gt;
- "At least one succeeded": run if at least one upstream task succeeded&lt;BR /&gt;
- "None failed": run if no upstream tasks failed&lt;BR /&gt;
- "All done": run regardless of upstream success/failure (useful for cleanup)&lt;/P&gt;
&lt;P&gt;This is configured per task in the dependency settings.&lt;/P&gt;
&lt;P&gt;Documentation: &lt;A href="https://docs.databricks.com/aws/en/jobs/run-if" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/run-if&lt;/A&gt;&lt;/P&gt;
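&lt;P&gt;In a Jobs API payload this is the run_if field on the task; for example, a hypothetical cleanup task that runs regardless of upstream outcome (task keys are placeholders):&lt;/P&gt;
&lt;PRE&gt;{
  "task_key": "cleanup",
  "run_if": "ALL_DONE",
  "depends_on": [{"task_key": "finance_job_1"}]
}&lt;/PRE&gt;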
&lt;P&gt;IF/ELSE BRANCHING&lt;/P&gt;
&lt;P&gt;If you need conditional logic (for example, skip Tier 2 entirely if a certain condition is met), the If/Else condition task evaluates expressions using task values, job parameters, or dynamic references. For example, you could check whether a quality gate passed before proceeding.&lt;/P&gt;
&lt;P&gt;Documentation: &lt;A href="https://docs.databricks.com/aws/en/jobs/if-else" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/if-else&lt;/A&gt;&lt;/P&gt;
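&lt;P&gt;A minimal sketch of such a quality-gate condition task (the task key and the upstream task value name are hypothetical; the left operand uses the documented task-values reference syntax):&lt;/P&gt;
&lt;PRE&gt;{
  "task_key": "quality_gate",
  "condition_task": {
    "op": "EQUAL_TO",
    "left": "{{tasks.check_quality.values.status}}",
    "right": "passed"
  }
}&lt;/PRE&gt;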
&lt;P&gt;MY RECOMMENDATION FOR YOUR SCENARIO&lt;/P&gt;
&lt;P&gt;For 100 HR jobs followed by 125 Finance jobs with a hard dependency, I would recommend Option 1 or Option 2:&lt;/P&gt;
&lt;P&gt;- If your jobs are diverse with different configurations, use Option 1 with individual Run Job tasks in a single orchestrator.&lt;BR /&gt;
- If your jobs can be parameterized into a common template, use Option 2 with For Each tasks for a cleaner, more maintainable design.&lt;/P&gt;
&lt;P&gt;Both approaches give you a single place to monitor the entire workflow, with a clear visual DAG showing the dependency between tiers.&lt;/P&gt;
&lt;P&gt;For additional reading on Lakeflow Jobs orchestration:&lt;BR /&gt;
&lt;A href="https://docs.databricks.com/aws/en/jobs" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs&lt;/A&gt;&lt;BR /&gt;
&lt;A href="https://docs.databricks.com/aws/en/jobs/sql" target="_blank"&gt;https://docs.databricks.com/aws/en/jobs/sql&lt;/A&gt; (SQL task types including alerts)&lt;/P&gt;
&lt;P&gt;* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for any obvious issues and for monitoring system reliability and update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand new features.&lt;/P&gt;
&lt;P&gt;If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.&lt;/P&gt;</description>
      <pubDate>Mon, 09 Mar 2026 01:06:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/robust-complex-scheduling-with-dependency-within-databricks/m-p/150292#M53337</guid>
      <dc:creator>SteveOstrowski</dc:creator>
      <dc:date>2026-03-09T01:06:39Z</dc:date>
    </item>
  </channel>
</rss>

