<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to stop a Streaming Job based on time of the week in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-stop-a-streaming-job-based-on-time-of-the-week/m-p/16130#M10342</link>
    <description>&lt;P&gt;Hi @Nolan Lavender, for example, if you want to stop streaming on Saturday, you could check the day of the week inside foreachBatch and run your maintenance then. The snippets below are pseudocode (here checking the clock on the driver with java.time).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;.foreachBatch { (batchDF: DataFrame, batchId: Long) =&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;if (java.time.LocalDate.now().getDayOfWeek == java.time.DayOfWeek.SATURDAY) {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;// run commands to maintain the table&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;}&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Alternatively, you can estimate how many micro-batches are processed in a week and periodically stop the streaming job. If your stream processes roughly 100 micro-batches per week, you could do something like the below.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;.foreachBatch { (batchDF: DataFrame, batchId: Long) =&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;if (batchId % 100 == 0) {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;// run commands to maintain the table&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;}&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;</description>
    <pubDate>Mon, 20 Sep 2021 16:44:42 GMT</pubDate>
    <dc:creator>mathan_pillai</dc:creator>
    <dc:date>2021-09-20T16:44:42Z</dc:date>
    <item>
      <title>How to stop a Streaming Job based on time of the week</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-stop-a-streaming-job-based-on-time-of-the-week/m-p/16128#M10340</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;I have an always-on job cluster triggering Spark Streaming jobs. I would like to stop this streaming job once a week to run table maintenance. I was looking to leverage the foreachBatch function to check a condition and stop the job accordingly.&lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 20 Aug 2021 20:51:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-stop-a-streaming-job-based-on-time-of-the-week/m-p/16128#M10340</guid>
      <dc:creator>nolanlavender00</dc:creator>
      <dc:date>2021-08-20T20:51:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to stop a Streaming Job based on time of the week</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-stop-a-streaming-job-based-on-time-of-the-week/m-p/16130#M10342</link>
      <description>&lt;P&gt;Hi @Nolan Lavender, for example, if you want to stop streaming on Saturday, you could check the day of the week inside foreachBatch and run your maintenance then. The snippets below are pseudocode (here checking the clock on the driver with java.time).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;.foreachBatch { (batchDF: DataFrame, batchId: Long) =&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;if (java.time.LocalDate.now().getDayOfWeek == java.time.DayOfWeek.SATURDAY) {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;// run commands to maintain the table&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;}&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Alternatively, you can estimate how many micro-batches are processed in a week and periodically stop the streaming job. If your stream processes roughly 100 micro-batches per week, you could do something like the below.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;.foreachBatch { (batchDF: DataFrame, batchId: Long) =&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;if (batchId % 100 == 0) {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;// run commands to maintain the table&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;}&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;</description>
      <pubDate>Mon, 20 Sep 2021 16:44:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-stop-a-streaming-job-based-on-time-of-the-week/m-p/16130#M10342</guid>
      <dc:creator>mathan_pillai</dc:creator>
      <dc:date>2021-09-20T16:44:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to stop a Streaming Job based on time of the week</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-stop-a-streaming-job-based-on-time-of-the-week/m-p/75991#M35127</link>
      <description>&lt;P&gt;You could also use the&amp;nbsp;&lt;A class="" href="https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers" target="_blank" rel="noopener noreferrer"&gt;"Available-now micro-batch" trigger&lt;/A&gt;&lt;SPAN&gt;. It processes whatever data is available and then stops, so you can do whatever you want between runs (sleep, shut down, vacuum, etc.)&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 28 Jun 2024 03:27:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-stop-a-streaming-job-based-on-time-of-the-week/m-p/75991#M35127</guid>
      <dc:creator>mroy</dc:creator>
      <dc:date>2024-06-28T03:27:46Z</dc:date>
    </item>
  </channel>
</rss>

