01-29-2025 03:06 AM
I have 7-8 different DLT pipelines that have to run according to their batch type, i.e. hourly or daily. Right now they are triggered individually according to their batch type.
I want to move to the next stage, where I club all these DLT pipelines together in a single workflow. I can simply create one task per DLT pipeline, but I am not able to figure out how to schedule some of them daily and some hourly.
Let's say I schedule the workflow hourly; then, based on a set of conditions (daily vs. hourly) that I can get from a metadata table, I would trigger only specific tasks. But I am not sure how to do it.
What would be the right approach if I want to keep all the DLT pipelines in a single workflow where some are on a daily schedule and some hourly? Also, these schedules are not fixed in Databricks; the workflow is triggered from outside according to the time.
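For context, the decision I have in mind would look roughly like this (the table and column names below are just placeholders for my metadata setup):

# Illustrative only: look up which batch type is due from a metadata table.
batch_type = (
    spark.table("control.pipeline_schedule")   # hypothetical metadata table
         .filter("due_now = true")             # rows marking what is due
         .select("batch_type")                 # 'daily' or 'hourly'
         .first()["batch_type"]
)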
01-29-2025 03:41 AM
Hello, thank you for your question!
Here’s a general approach to achieve this, but please let us know if this reading of the requirement does not align:
1. Create a parent workflow with a single scheduled trigger.
2. Use a conditional execution mechanism: a master task sets values via dbutils.jobs.taskValues, and downstream task execution is gated on them.
3. Configure dynamic task execution: in your master task, set a value like dbutils.jobs.taskValues.set("run_daily_pipelines", "true"). Then configure each pipeline task with a "Depends on" reference to the master task and set its execution condition based on that value.
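As a minimal sketch, the master (gating) task notebook could look like this; the schedule rule and key names are illustrative, not a fixed recommendation:

# Gating task: decide which batch types should run for this trigger.
# Illustrative rule: hourly pipelines run on every trigger, daily ones
# only on the midnight trigger. Replace with a lookup against your
# metadata table if that is where the decision lives.
from datetime import datetime, timezone

now = datetime.now(timezone.utc)
run_daily = now.hour == 0  # example condition only

# Publish the decision so downstream tasks can branch on it.
dbutils.jobs.taskValues.set(key="run_daily_pipelines",
                            value="true" if run_daily else "false")
dbutils.jobs.taskValues.set(key="run_hourly_pipelines", value="true")

Each DLT pipeline task (or an If/else condition task placed in front of it) can then reference the published value with a dynamic reference such as {{tasks.<gating_task_name>.values.run_daily_pipelines}} and run only when it equals "true".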
If this still does not give you flexible enough conditional execution, there are further options we can explore.
Please let me know if your question was meant to be more specifically addressed, and/or if the above needs further clarification. In the meantime, hope it helps!
01-29-2025 03:29 AM
Hi there, thanks, I understood the approach. But how to implement it is what I am not able to figure out.
If you can give a small demo example, it will be really helpful.
01-29-2025 03:53 AM
Hi @VZLA, I got the idea. There will be a small change in the way we use it: since we don't schedule the workflow in Databricks, we trigger it using the API. So I will pass a job parameter along with the trigger, based on the timestamp, indicating whether it is a daily or hourly run, and then handle it inside the workflow.
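Roughly like this; the host, token, job ID, and parameter name below are placeholders, and the payload follows the Jobs API 2.1 run-now call:

# Trigger the workflow externally with a job parameter for the batch type.
import requests

resp = requests.post(
    "https://<workspace-host>/api/2.1/jobs/run-now",
    headers={"Authorization": "Bearer <token>"},
    json={
        "job_id": 123456789,                        # placeholder job ID
        "job_parameters": {"batch_type": "daily"},  # or "hourly", chosen by trigger time
    },
)
resp.raise_for_status()
print(resp.json()["run_id"])

Inside the workflow, an If/else condition task can then branch on {{job.parameters.batch_type}}.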
Got the idea. Will circle back if any help is required.