Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Databricks Workflow design

ashraf1395
Valued Contributor II

I have 7-8 different DLT pipelines that have to run according to their batch type, i.e. hourly or daily. Right now each is triggered separately according to its batch type.

I want to move to the next stage and combine all these DLT pipelines into a single workflow. I can simply create one task per DLT pipeline, but I can't figure out how to schedule some of them daily and others hourly.

Let's say I schedule the workflow hourly; then, depending on a set of conditions (daily vs. hourly, which I can read from a metadata table), I want to trigger only specific tasks. But I am not sure how to do that.

What would be the right approach if I want to keep all the DLT pipelines in a single workflow where some run on a daily schedule and some hourly? Note that these schedules are not fixed in Databricks; the workflow is triggered from outside according to the time.

3 REPLIES

ashraf1395
Valued Contributor II

Hi there, thanks, I understood the approach, but I can't figure out how to implement it.
If you can give a small demo example, it would be really helpful.

VZLA
Databricks Employee

Hello, thank you for your question!

Here’s a general approach to achieve this, but please let us know if this doesn’t align with your requirement:

  1. Create a Parent Workflow with a Single Scheduled Trigger:

    • Schedule the workflow to run hourly since that is the more frequent batch type.
    • Use a master task that queries the metadata table to determine which DLT pipelines should run in that execution.
  2. Use a Conditional Execution Mechanism:

    • Add a notebook task as the first step in the workflow that:
      • Reads the metadata table (which contains schedule information).
      • Determines if the run is hourly or daily based on the current timestamp.
      • Sets workflow variables or dbutils.jobs.taskValues() for downstream task execution.
  3. Configure Dynamic Task Execution:

    • Define one task per DLT pipeline in the workflow.
    • Use conditional execution (Run if condition is met) to ensure that:
      • Hourly pipelines run on every execution.
      • Daily pipelines run only when the master task determines it's a daily run.
  4. Use dbutils.jobs.taskValues to Control Execution:

    • In your master task, set a value like: dbutils.jobs.taskValues.set("run_daily_pipelines", "true")
    • Then configure each pipeline task to depend on the master task, and set its execution condition based on that value.
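The master-task pattern in steps 2-4 could be sketched roughly like this (the daily cutoff hour and the task-value key are assumptions for illustration; the Databricks-only calls are shown as comments):

```python
from datetime import datetime, timezone

def is_daily_run(now: datetime, daily_hour: int = 0) -> bool:
    """Treat the run as 'daily' when the current UTC hour matches the
    configured daily hour; every other hour is an hourly-only run."""
    return now.hour == daily_hour

# Inside the master notebook task (sketch -- runs only on Databricks):
# run_daily = is_daily_run(datetime.now(timezone.utc))
# dbutils.jobs.taskValues.set(key="run_daily_pipelines",
#                             value="true" if run_daily else "false")
```

Each daily pipeline task can then depend on the master task and use a "Run if" condition comparing the dynamic value reference {{tasks.<master_task_name>.values.run_daily_pipelines}} to "true".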

Alternative Approach: Two Separate Workflows

If this still does not give you enough flexibility in conditional execution, consider:

  • A daily workflow (triggered once per day).
  • An hourly workflow (triggered every hour).
  • Both workflows query the metadata table and only trigger relevant DLT pipelines.
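Under this two-workflow alternative, each workflow's first task could filter the metadata table by its own cadence and start only the matching DLT pipelines. A rough sketch (the metadata table shape, host, and token handling are assumptions; the update endpoint is the Delta Live Tables Pipelines REST API):

```python
import json
import urllib.request

def pipelines_for_batch(rows: list[dict], batch_type: str) -> list[str]:
    """Pick the pipeline ids whose batch type matches this workflow's cadence."""
    return [r["pipeline_id"] for r in rows if r["batch_type"] == batch_type]

def start_pipeline(host: str, token: str, pipeline_id: str) -> None:
    """Start an update for one DLT pipeline via the REST API."""
    req = urllib.request.Request(
        f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
        data=b"{}",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp).get("update_id"))

# In the hourly workflow, the first task would run something like:
# rows = [r.asDict() for r in spark.table("metadata.pipeline_schedule").collect()]
# for pid in pipelines_for_batch(rows, "hourly"):
#     start_pipeline(host, token, pid)
```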

Please let me know if your question was meant to address something more specific, and/or if the above needs further clarification. In the meantime, I hope it helps!

ashraf1395
Valued Contributor II

Hi @VZLA , I got the idea. There will be a small change in the way we use it: since we don't schedule the workflow in Databricks but trigger it using the API, I will pass a job parameter along with the trigger according to the timestamp (whether it is a daily or hourly run) and then handle it inside the workflow.

Got the idea. Will circle back if any help is required.
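The external trigger with a job parameter described here might look like this sketch against the Jobs API run-now endpoint (the parameter name run_type and the credential handling are assumptions, not anything confirmed in the thread):

```python
import json
import urllib.request

def build_run_now_payload(job_id: int, run_type: str) -> dict:
    """Payload for Jobs API 2.1 run-now; 'run_type' is an assumed parameter name."""
    return {"job_id": job_id, "job_parameters": {"run_type": run_type}}

def trigger_workflow(host: str, token: str, job_id: int, run_type: str) -> int:
    """POST run-now with the job parameter and return the new run id."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/run-now",
        data=json.dumps(build_run_now_payload(job_id, run_type)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["run_id"]
```

Inside the workflow, the parameter can then drive task conditions via the dynamic value reference {{job.parameters.run_type}}, or be read directly in a notebook.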
