<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Run continuous job for a period of time in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98375#M39709</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/89478"&gt;@MuthuLakshmi&lt;/a&gt;, thank you for your answer. However, it &lt;STRONG&gt;doesn't help&lt;/STRONG&gt; with my question, so let me rephrase it.&lt;/P&gt;&lt;P&gt;In short: how can I configure a&amp;nbsp;&lt;STRONG&gt;Continuous&lt;/STRONG&gt;&amp;nbsp;job to run only for a period of time, e.g. from 8 AM to 5 PM every day, and stop automatically at all other times?&lt;/P&gt;&lt;P&gt;In detail:&amp;nbsp;I have a job that should run&amp;nbsp;&lt;STRONG&gt;continuously&lt;/STRONG&gt;&amp;nbsp;from 8 AM to 5 PM every day and be stopped for the rest of the day. The job's trigger is set to&amp;nbsp;&lt;STRONG&gt;Continuous&lt;/STRONG&gt;, but &lt;STRONG&gt;there is no option&lt;/STRONG&gt; to configure the running period.&amp;nbsp;I understand that we can achieve this by manually starting and cancelling the job in the UI, or by starting and cancelling it programmatically using these APIs:&lt;BR /&gt;&lt;SPAN&gt;&lt;A href="https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98299/thread-id/39675" target="_blank" rel="nofollow noopener noreferrer"&gt;https://&amp;lt;databricks-instance&amp;gt;/api/2.1/jobs/run-now&lt;/A&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;A href="https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98299/thread-id/39675" target="_blank" rel="nofollow noopener noreferrer"&gt;https://&amp;lt;databricks-instance&amp;gt;/api/2.1/jobs/runs/cancel&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, is there any job setting, e.g. using cron syntax, that achieves this?&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 11 Nov 2024 21:22:37 GMT</pubDate>
    <dc:creator>theanhdo</dc:creator>
    <dc:date>2024-11-11T21:22:37Z</dc:date>
    <item>
      <title>Run continuous job for a period of time</title>
      <link>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98299#M39675</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;I have a job whose&amp;nbsp;trigger type is configured as&amp;nbsp;&lt;STRONG&gt;Continuous&lt;/STRONG&gt;. I want to run this &lt;STRONG&gt;Continuous&lt;/STRONG&gt; job only for a period of time each day, e.g. 8 AM to 5 PM. I understand that we can achieve this by manually starting and cancelling the job in the UI, or by starting and cancelling it programmatically using these APIs:&lt;BR /&gt;&lt;SPAN&gt;&lt;A href="https://community.databricks.com/" target="_blank"&gt;https://&amp;lt;databricks-instance&amp;gt;/api/2.1/jobs/run-now&lt;/A&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;A href="https://community.databricks.com/" target="_blank"&gt;https://&amp;lt;databricks-instance&amp;gt;/api/2.1/jobs/runs/cancel&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, is there any job setting, e.g. using cron syntax, that achieves this?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 11 Nov 2024 00:16:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98299#M39675</guid>
      <dc:creator>theanhdo</dc:creator>
      <dc:date>2024-11-11T00:16:57Z</dc:date>
    </item>
    <item>
      <title>Re: Run continuous job for a period of time</title>
      <link>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98327#M39693</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/111593"&gt;@theanhdo&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;To schedule a job to run at 8 AM every day, you should use the&amp;nbsp;&lt;STRONG&gt;Scheduled&lt;/STRONG&gt;&amp;nbsp;trigger type rather than the&amp;nbsp;&lt;STRONG&gt;Continuous&lt;/STRONG&gt;&amp;nbsp;trigger type. The&amp;nbsp;&lt;STRONG&gt;Continuous&lt;/STRONG&gt;&amp;nbsp;trigger type is designed to keep a job running continuously, which is not suitable for running a job at a specific time each day.&lt;/P&gt;
&lt;P class="p1"&gt;Here’s how you can schedule a job to run at 8 AM every day using the&amp;nbsp;&lt;STRONG&gt;Scheduled&lt;/STRONG&gt;&amp;nbsp;trigger type:&lt;/P&gt;
&lt;OL class="ol1"&gt;
&lt;LI class="li1"&gt;&lt;STRONG&gt;Navigate to Workflows&lt;/STRONG&gt;:&lt;/LI&gt;
&lt;UL class="ul1"&gt;
&lt;LI class="li1"&gt;In the Databricks workspace, go to the sidebar and click on&amp;nbsp;&lt;STRONG&gt;Workflows&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI class="li1"&gt;&lt;STRONG&gt;Select the Job&lt;/STRONG&gt;:&lt;/LI&gt;
&lt;UL class="ul1"&gt;
&lt;LI class="li1"&gt;Click the job name in the&amp;nbsp;&lt;STRONG&gt;Name&lt;/STRONG&gt;&amp;nbsp;column on the&amp;nbsp;&lt;STRONG&gt;Jobs&lt;/STRONG&gt;&amp;nbsp;tab.&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI class="li1"&gt;&lt;STRONG&gt;Add a Trigger&lt;/STRONG&gt;:&lt;/LI&gt;
&lt;UL class="ul1"&gt;
&lt;LI class="li1"&gt;In the&amp;nbsp;&lt;STRONG&gt;Job details&lt;/STRONG&gt;&amp;nbsp;panel, click&amp;nbsp;&lt;STRONG&gt;Add trigger&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI class="li1"&gt;&lt;STRONG&gt;Configure the Trigger&lt;/STRONG&gt;:&lt;/LI&gt;
&lt;UL class="ul1"&gt;
&lt;LI class="li1"&gt;In the&amp;nbsp;&lt;STRONG&gt;Trigger type&lt;/STRONG&gt;&amp;nbsp;dropdown, select&amp;nbsp;&lt;STRONG&gt;Scheduled&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI class="li1"&gt;In the&amp;nbsp;&lt;STRONG&gt;Schedule type&lt;/STRONG&gt;&amp;nbsp;dropdown, select&amp;nbsp;&lt;STRONG&gt;Advanced&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI class="li1"&gt;&lt;STRONG&gt;Set the Schedule&lt;/STRONG&gt;:&lt;/LI&gt;
&lt;UL class="ul1"&gt;
&lt;LI class="li1"&gt;Select the&amp;nbsp;&lt;STRONG&gt;Show Cron Syntax&lt;/STRONG&gt;&amp;nbsp;checkbox to display and edit the schedule using Quartz cron syntax.&lt;/LI&gt;
&lt;LI class="li1"&gt;Enter the following Quartz cron expression (seconds, minutes, hours, day of month, month, day of week) to run the job at 8 AM every day: 0&amp;nbsp;0&amp;nbsp;8&amp;nbsp;*&amp;nbsp;*&amp;nbsp;?&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI class="li1"&gt;&lt;STRONG&gt;Save the Configuration&lt;/STRONG&gt;:&lt;/LI&gt;
&lt;UL class="ul1"&gt;
&lt;LI class="li1"&gt;Click&amp;nbsp;&lt;STRONG&gt;Save&lt;/STRONG&gt;&amp;nbsp;to apply the schedule.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/OL&gt;
&lt;P class="p1"&gt;This configuration will ensure that your job runs at 8 AM every day.&lt;/P&gt;
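&lt;P class="p1"&gt;For reference, the same daily schedule could probably also be applied through the Jobs API rather than the UI. The sketch below only builds the request body for the jobs/update endpoint; the helper name and the job ID 123 are hypothetical, and the timezone is an assumption. If the job still has a Continuous trigger, that setting would likely have to be removed first.&lt;/P&gt;

```python
import json


def daily_schedule_update(job_id, hour, timezone_id="UTC"):
    """Return a jobs/update request body that runs job_id once a day at hour:00."""
    # Quartz cron fields: seconds minutes hours day-of-month month day-of-week
    cron = "0 0 {} * * ?".format(hour)
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": cron,
                "timezone_id": timezone_id,  # assumption: adjust to your region
                "pause_status": "UNPAUSED",
            }
        },
    }


# 123 is a hypothetical job ID used only for illustration.
print(json.dumps(daily_schedule_update(123, 8), indent=2))
```

&lt;P class="p1"&gt;The printed JSON would be sent as the body of a POST to /api/2.1/jobs/update.&lt;/P&gt;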
&lt;P class="p1"&gt;&lt;STRONG&gt;To stop the job, use the REST API&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="p1"&gt;You will need to create a separate job or task that stops the main job at 5 PM. This can be done using the Databricks REST API to cancel the job run.&lt;/P&gt;
&lt;P class="p1"&gt;Create a new job that uses the REST API to cancel the main job run.&lt;/P&gt;
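&lt;P class="p1"&gt;A minimal Python sketch of what such a cancelling task might look like, assuming the Jobs API 2.1 endpoints runs/list and runs/cancel and a personal access token. The function names are hypothetical, and only the first page of the run listing is handled.&lt;/P&gt;

```python
import json
import urllib.parse
import urllib.request


def build_request(host, token, method, path, params=None, body=None):
    """Build an authenticated request for a Databricks REST endpoint."""
    url = host.rstrip("/") + path
    if params:
        url = url + "?" + urllib.parse.urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send the request and decode the JSON response (empty body becomes {})."""
    with urllib.request.urlopen(request) as response:
        raw = response.read()
    return json.loads(raw) if raw else {}


def cancel_active_runs(host, token, job_id, transport=send):
    """Cancel the active runs of job_id and return the cancelled run IDs."""
    # List only runs that are still active (first page of results only).
    listing = transport(build_request(
        host, token, "GET", "/api/2.1/jobs/runs/list",
        params={"job_id": job_id, "active_only": "true"},
    ))
    cancelled = []
    for run in listing.get("runs", []):
        # Cancellation is asynchronous; the run may still be terminating afterwards.
        transport(build_request(
            host, token, "POST", "/api/2.1/jobs/runs/cancel",
            body={"run_id": run["run_id"]},
        ))
        cancelled.append(run["run_id"])
    return cancelled
```

&lt;P class="p1"&gt;The cancelling job would call cancel_active_runs with the workspace URL, a token read from a secret, and the ID of the main job. The transport parameter exists only so the logic can be tested without a real workspace.&lt;/P&gt;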
&lt;P class="p3"&gt;Add a trigger to this new job with the following cron expression to run at 5 PM every day&lt;/P&gt;</description>
      <pubDate>Mon, 11 Nov 2024 12:24:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98327#M39693</guid>
      <dc:creator>MuthuLakshmi</dc:creator>
      <dc:date>2024-11-11T12:24:11Z</dc:date>
    </item>
    <item>
      <title>Re: Run continuous job for a period of time</title>
      <link>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98375#M39709</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/89478"&gt;@MuthuLakshmi&lt;/a&gt;, thank you for your answer. However, it &lt;STRONG&gt;doesn't help&lt;/STRONG&gt; with my question, so let me rephrase it.&lt;/P&gt;&lt;P&gt;In short: how can I configure a&amp;nbsp;&lt;STRONG&gt;Continuous&lt;/STRONG&gt;&amp;nbsp;job to run only for a period of time, e.g. from 8 AM to 5 PM every day, and stop automatically at all other times?&lt;/P&gt;&lt;P&gt;In detail:&amp;nbsp;I have a job that should run&amp;nbsp;&lt;STRONG&gt;continuously&lt;/STRONG&gt;&amp;nbsp;from 8 AM to 5 PM every day and be stopped for the rest of the day. The job's trigger is set to&amp;nbsp;&lt;STRONG&gt;Continuous&lt;/STRONG&gt;, but &lt;STRONG&gt;there is no option&lt;/STRONG&gt; to configure the running period.&amp;nbsp;I understand that we can achieve this by manually starting and cancelling the job in the UI, or by starting and cancelling it programmatically using these APIs:&lt;BR /&gt;&lt;SPAN&gt;&lt;A href="https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98299/thread-id/39675" target="_blank" rel="nofollow noopener noreferrer"&gt;https://&amp;lt;databricks-instance&amp;gt;/api/2.1/jobs/run-now&lt;/A&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;A href="https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98299/thread-id/39675" target="_blank" rel="nofollow noopener noreferrer"&gt;https://&amp;lt;databricks-instance&amp;gt;/api/2.1/jobs/runs/cancel&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, is there any job setting, e.g. using cron syntax, that achieves this?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 11 Nov 2024 21:22:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/98375#M39709</guid>
      <dc:creator>theanhdo</dc:creator>
      <dc:date>2024-11-11T21:22:37Z</dc:date>
    </item>
    <item>
      <title>Re: Run continuous job for a period of time</title>
      <link>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/101801#M40831</link>
      <description>&lt;P&gt;There doesn't seem to be a proper way to do this currently.&lt;/P&gt;&lt;P&gt;We ended up running the job a couple of times to figure out an upper bound for its run time, and then used that as the interval in the cron schedule. Some jobs now run every 5 minutes during office hours, which is close enough for our use case.&lt;/P&gt;&lt;P class=""&gt;This does cause issues with Skipped runs when compute is slow to spin up, so make sure you adjust any notifications accordingly.&lt;/P&gt;&lt;P class=""&gt;Alternatively, one could apply a Continuous trigger to the job, then toggle the job's pause status to UNPAUSED at the start of the day and to PAUSED at the end of the day using the Databricks API. We added two jobs in Databricks that call the API to toggle this state. Do test this thoroughly, or you'll have some costs waiting for you by Monday morning &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P class=""&gt;This all feels very hacky for functionality that should arguably be supported by default.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Dec 2024 16:44:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/run-continuous-job-for-a-period-of-time/m-p/101801#M40831</guid>
      <dc:creator>eslaats</dc:creator>
      <dc:date>2024-12-11T16:44:39Z</dc:date>
    </item>
  </channel>
</rss>

