Databricks workflow with sequenced tasks

h2p5cq8
New Contributor II

I have a continuous workflow. It is continuous because I would like it to run every minute, and when it has work to do the first task takes several minutes. As I understand it, continuous workflows won't requeue while a job is currently running, whereas scheduled/periodic workflows will queue a new run if a previous one is still running.

My problem is that I need to add a second task to the workflow that runs after the first task. When I try to create the second task I get an error saying "having dependencies inside a task is not allowed in continuous job." The dependency here is that the second task comes after the first task.

How can I create a continuous workflow with multiple, sequenced tasks? If this can't be done, how can I have a frequently scheduled workflow that does not queue a second job while a previous job is still running? Thank you!

5 REPLIES

Alberto_Umana
Databricks Employee

Hi @h2p5cq8,

Unfortunately, continuous workflows typically don’t support dependencies between tasks, as they are designed to run continuously without a defined start and end. Let me check for additional approaches. 

Alberto_Umana
Databricks Employee

Hi @h2p5cq8,

 
Would it be possible to use a scheduled workflow that runs every minute instead of a continuous one? To prevent multiple instances from running simultaneously, you can implement concurrency control:
1. Set the workflow to run every minute using a cron expression. Databricks job schedules use Quartz cron syntax, so use something like `0 * * * * ?` rather than the five-field Unix form `* * * * *`.
2. At the beginning of your workflow, add a check to see if a previous instance is still running.

Because a scheduled job is not continuous, you can also add task dependencies to it, so the second task runs only after the first one finishes.
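
As a rough illustration of the suggestion above, here is a minimal sketch of what the job settings and the "is a previous run still active" check might look like. It assumes the Jobs API 2.1 payload shape (`schedule.quartz_cron_expression`, `max_concurrent_runs`, `tasks[].depends_on`) and the `runs` list returned by `GET /api/2.1/jobs/runs/list?active_only=true`. The job name, task keys, and notebook paths are hypothetical, and compute settings are left out for brevity.

```python
def build_job_settings(first_notebook: str, second_notebook: str) -> dict:
    """Build a Jobs API 2.1 style settings payload for a two-task scheduled job.

    Names and paths are placeholders; cluster/compute settings are omitted.
    """
    return {
        "name": "every-minute-sequenced-job",  # hypothetical job name
        "schedule": {
            # Quartz syntax: second 0 of every minute.
            "quartz_cron_expression": "0 * * * * ?",
            "timezone_id": "UTC",
            "pause_status": "UNPAUSED",
        },
        # Allow only one run of this job at a time.
        "max_concurrent_runs": 1,
        "tasks": [
            {
                "task_key": "first_task",  # hypothetical task key
                "notebook_task": {"notebook_path": first_notebook},
            },
            {
                "task_key": "second_task",  # hypothetical task key
                # Run only after first_task has finished.
                "depends_on": [{"task_key": "first_task"}],
                "notebook_task": {"notebook_path": second_notebook},
            },
        ],
    }


def other_active_runs(runs_list_response: dict, current_run_id: int) -> list:
    """Return run IDs of active runs other than the current one.

    Expects the JSON body of a runs/list call made with active_only=true;
    the "runs" key may be absent when nothing is running.
    """
    runs = runs_list_response.get("runs", [])
    return [r["run_id"] for r in runs if r["run_id"] != current_run_id]
```

If `other_active_runs(...)` returns a non-empty list at the start of the first task, the task could exit early instead of doing any work. With `max_concurrent_runs` set to 1 this check may be redundant, so treat it as an optional safeguard.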

Thank you @Alberto_Umana. I am happy to make it a scheduled workflow that runs every minute. I have set concurrency to 1, but will that keep subsequent jobs from queueing up? If not, is there some other way to stop Databricks from queueing a subsequent job while a previous job is running? I don't think it's absolutely required; I am just trying to avoid a scenario where a job becomes long running and the job queue starts filling up.

Alberto_Umana
Databricks Employee

Hi @h2p5cq8,

No problem! You can disable the queue option to stop that. Go to the Advanced settings in the Job details side panel and toggle off the Queue option to prevent runs from being queued.
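
For anyone who manages jobs through the API rather than the UI, the same change can probably be expressed as a settings update. This is a sketch only: it assumes the Jobs API 2.1 `jobs/update` request shape (`job_id` plus `new_settings`) and that the `queue.enabled` setting corresponds to the Queue toggle described above. The job ID is a placeholder.

```python
import json


def build_disable_queue_update(job_id: int) -> dict:
    """Build a jobs/update request body that turns queueing off.

    Assumes `queue.enabled` mirrors the Queue toggle in the job's
    Advanced settings; only the listed settings are changed.
    """
    return {
        "job_id": job_id,
        "new_settings": {
            "queue": {"enabled": False},
            # Keep a single active run so overlapping triggers are skipped.
            "max_concurrent_runs": 1,
        },
    }


if __name__ == "__main__":
    # 123 is a placeholder job ID; the body would be POSTed to /api/2.1/jobs/update.
    print(json.dumps(build_disable_queue_update(123), indent=2))
```

With queueing off and one concurrent run allowed, a trigger that fires while a run is still active should be skipped rather than piling up behind it.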

Perfect! Thank you for your help @Alberto_Umana 
