Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Run continuous job for a period of time

theanhdo
New Contributor III

Hi there,

I have a job where the Trigger type is configured as Continuous. I want to only run the Continuous job for a period of time per day, e.g. 8AM - 5PM. I understand that we can achieve it by manually starting and cancelling the job on the UI, or by programmatically starting and cancelling the job using these APIs
https://<databricks-instance>/api/2.1/jobs/run-now
https://<databricks-instance>/api/2.1/jobs/runs/cancel

However, I would like to ask if there is any job setting, e.g. using cron syntax, to achieve this?

 

3 REPLIES

MuthuLakshmi
Databricks Employee

@theanhdo 

To schedule a job to run at 8 AM every day, you should use the Scheduled trigger type rather than the Continuous trigger type. The Continuous trigger type is designed to keep a job running continuously, which is not suitable for running a job at a specific time each day.

Here's how you can schedule a job to run at 8 AM every day using the Scheduled trigger type:

  1. Navigate to Workflows:
    • In the Databricks workspace, go to the sidebar and click on Workflows.
  2. Select the Job:
    • Click the job name in the Name column on the Jobs tab.
  3. Add a Trigger:
    • In the Job details panel, click Add trigger.
  4. Configure the Trigger:
    • In the Trigger type dropdown, select Scheduled.
    • In the Schedule type dropdown, select Advanced.
  5. Set the Schedule:

Use the following Quartz cron expression (fields: seconds, minutes, hours, day of month, month, day of week) to schedule the job to run at 8 AM every day:

0 0 8 * * ?

Optionally, select the Show Cron Syntax checkbox to display and edit the schedule using Quartz Cron Syntax.

  6. Save the Configuration:
    • Click Save to apply the schedule.

This configuration will ensure that your job runs at 8 AM every day.
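The same schedule can also be set through the Jobs API instead of the UI. The following is a minimal sketch, not a definitive implementation: it only builds the request body that would be POSTed to `/api/2.1/jobs/update`. The helper names `daily_quartz_cron` and `schedule_update_payload`, the job ID `123`, and the `UTC` timezone are assumptions made for illustration. Note that Quartz expressions start with a seconds field, so 8 AM daily is `0 0 8 * * ?`.

```python
import json


def daily_quartz_cron(hour: int, minute: int = 0) -> str:
    """Return a Quartz cron expression that fires once a day at hour:minute."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0-23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0-59")
    # Quartz field order: seconds minutes hours day-of-month month day-of-week
    return f"0 {minute} {hour} * * ?"


def schedule_update_payload(job_id: int, hour: int, timezone_id: str = "UTC") -> dict:
    """Build a jobs/update body that sets a daily schedule (hypothetical helper)."""
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": daily_quartz_cron(hour),
                "timezone_id": timezone_id,
                "pause_status": "UNPAUSED",
            }
        },
    }


if __name__ == "__main__":
    # 123 is a placeholder job ID, not a real one.
    print(json.dumps(schedule_update_payload(123, 8), indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to check the expression before sending anything to the workspace.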

To stop the job, you have to use the REST API.

Create a separate job or task that cancels the main job's run at 5 PM by calling the Databricks REST API.

Add a trigger to this new job with a cron expression that runs at 5 PM every day
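A rough sketch of what such a "stopper" task could look like is below. It lists the active runs of the main job via `/api/2.1/jobs/runs/list` and cancels each one via `/api/2.1/jobs/runs/cancel`. The function names `make_sender` and `cancel_active_runs` are hypothetical, and the host, token, and job ID are placeholders you would supply yourself. One caveat, stated as an assumption rather than a guarantee: if the main job's trigger is Continuous and not paused, a new run may start again after the cancel.

```python
import json
import urllib.request
from typing import Callable, List, Optional

# A sender takes (HTTP method, API path, optional JSON body) and returns the parsed reply.
Send = Callable[[str, str, Optional[dict]], dict]


def make_sender(host: str, token: str) -> Send:
    """Return a sender that calls the workspace REST API with a bearer token."""
    def send(method: str, path: str, body: Optional[dict] = None) -> dict:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        request = urllib.request.Request(
            f"{host}{path}",
            data=data,
            method=method,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(request) as response:
            raw = response.read()
        return json.loads(raw) if raw else {}
    return send


def cancel_active_runs(send: Send, job_id: int) -> List[int]:
    """Cancel every active run of job_id and return the run IDs that were cancelled."""
    # Only the first page of results is read; one job rarely has many active runs.
    listing = send("GET", f"/api/2.1/jobs/runs/list?job_id={job_id}&active_only=true", None)
    run_ids = [run["run_id"] for run in listing.get("runs", [])]
    for run_id in run_ids:
        send("POST", "/api/2.1/jobs/runs/cancel", {"run_id": run_id})
    return run_ids
```

Usage would be along the lines of `cancel_active_runs(make_sender("https://<databricks-instance>", token), job_id)`, scheduled at 5 PM. Passing the sender in as a parameter keeps the cancel logic testable without a live workspace.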

theanhdo
New Contributor III

Hi @MuthuLakshmi, thank you for your answer. However, it doesn't address my question. Let me rephrase it.

In short, my question is how to configure a Continuous job to run for a period of time, e.g. from 8AM to 5PM every day, and automatically stop at other times of the day?

In detail, I have a job that should run continuously from 8AM to 5PM every day, and at other times of the day I want it stopped. The job is configured with the Trigger set as Continuous; however, there is no option to configure the running period. I understand that we can achieve it by manually starting and cancelling the job on the UI, or by programmatically starting and cancelling the job using these APIs
https://<databricks-instance>/api/2.1/jobs/run-now
https://<databricks-instance>/api/2.1/jobs/runs/cancel

However, I would like to ask if there is any job setting, e.g. using cron syntax, to achieve this?

eslaats
New Contributor II

There doesn't seem to be a proper way to do this currently.

We ended up running the job a couple of times in order to figure out some upper bound for run time, and just using that in the cron. Some jobs now run every 5 minutes during office hours, which is close enough for our use case.

This does cause issues with Skipped runs when compute is slow to spin up, so make sure you adjust any notifications accordingly.
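For anyone trying the cron route, here is a small sketch of how such an "every N minutes during office hours" expression could be built in Quartz syntax, which Databricks schedules use. The helper `office_hours_cron` and its parameters are made up for this example, and the 8-to-17 weekday window is an assumption, not the exact schedule described above.

```python
def office_hours_cron(every_minutes: int, start_hour: int, end_hour: int,
                      weekdays_only: bool = True) -> str:
    """Quartz cron that fires every `every_minutes` from start_hour up to (not including) end_hour."""
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be in 1-59")
    if not 0 <= start_hour < end_hour <= 24:
        raise ValueError("need 0 <= start_hour < end_hour <= 24")
    last_hour = end_hour - 1  # the hours field is inclusive, so stop one hour early
    hours = str(start_hour) if last_hour == start_hour else f"{start_hour}-{last_hour}"
    # Quartz requires '?' in exactly one of day-of-month / day-of-week.
    day_of_month, day_of_week = ("?", "MON-FRI") if weekdays_only else ("*", "?")
    return f"0 0/{every_minutes} {hours} {day_of_month} * {day_of_week}"


if __name__ == "__main__":
    print(office_hours_cron(5, 8, 17))
```

With these inputs the last trigger of the day is at 16:55, so a run that starts then can still finish after 17:00. Choose the interval to be longer than the observed upper bound on run time to reduce skipped runs.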

Alternatively, one could apply a Continuous schedule to the job, then toggle the pause status for that job to active (`UNPAUSED` in the API) at the start and to `PAUSED` at the end of the day using the Databricks API. We added two jobs in Databricks that call the API to toggle this state. Do test this thoroughly, or you'll have some costs waiting for you by Monday morning 🙂
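A minimal sketch of the toggle approach, under stated assumptions: the Jobs API, as far as I know, spells the two states `UNPAUSED` and `PAUSED` under the `continuous` setting, and the body below would be POSTed to `/api/2.1/jobs/update`. The helper names, the 8:00 to 17:00 window, and the job ID `123` are illustrative only, not what we actually deployed.

```python
from datetime import time


def desired_pause_status(now: time, start: time = time(8), end: time = time(17)) -> str:
    """Return the pause status the Continuous job should have at local time `now`."""
    # The window is half-open: running from `start` up to, but not including, `end`.
    return "UNPAUSED" if start <= now < end else "PAUSED"


def continuous_update_payload(job_id: int, pause_status: str) -> dict:
    """Build a jobs/update body that pauses or unpauses a Continuous job."""
    if pause_status not in ("PAUSED", "UNPAUSED"):
        raise ValueError("pause_status must be 'PAUSED' or 'UNPAUSED'")
    return {
        "job_id": job_id,
        "new_settings": {"continuous": {"pause_status": pause_status}},
    }


if __name__ == "__main__":
    # 123 is a placeholder job ID.
    print(continuous_update_payload(123, desired_pause_status(time(9, 30))))
```

Deriving the status from the current time, rather than hard-coding "pause" in one job and "unpause" in another, means a single toggler job can run on both schedules and will also correct the state if one trigger was missed. Weekends would need an extra check on the date.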

This all feels very hacky for functionality that feels like it should be supported by default.
