<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: databricks job cancel does not wait for termination of streaming tasks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/databricks-job-cancel-does-not-wait-for-termination-of-streaming/m-p/127445#M47972</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/164253"&gt;@Vidhi_Khaitan&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;We run the reporting-layer streams 24/7 to serve near-real-time analytics, and each one executes a MERGE statement in foreachBatch. We cannot adopt the scheduled approach. I was exploring the continuous trigger in Databricks jobs to keep the streams running 24/7. Could that be a solution for my case? Will it stop or pause the streams gracefully?&lt;/P&gt;</description>
    <pubDate>Tue, 05 Aug 2025 11:10:15 GMT</pubDate>
    <dc:creator>Sadam97</dc:creator>
    <dc:date>2025-08-05T11:10:15Z</dc:date>
    <item>
      <title>databricks job cancel does not wait for termination of streaming tasks</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-job-cancel-does-not-wait-for-termination-of-streaming/m-p/127427#M47965</link>
      <description>&lt;P&gt;We have created Databricks jobs, each with multiple tasks. Each task is a streaming query that runs 24/7 with checkpointing enabled. We want the streams to resume from their checkpointed state when we cancel and rerun the job, but it seems that cancelling a job run kills the parent process at the OS level and does not wait for the streams in each task to stop. We are seeing missing data between our staging and reporting layers. Because we have to cancel and rerun the reporting job whenever we make changes or additions, the cancellations appear to be the cause. To fix the missing data, we have to drop the checkpoint and recompute the whole reporting layer, which is a major bottleneck for us.&lt;/P&gt;&lt;P&gt;Is there a way to handle this, for example by making the Databricks job wait for the streams to terminate?&lt;/P&gt;</description>
      <pubDate>Tue, 05 Aug 2025 09:00:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-job-cancel-does-not-wait-for-termination-of-streaming/m-p/127427#M47965</guid>
      <dc:creator>Sadam97</dc:creator>
      <dc:date>2025-08-05T09:00:05Z</dc:date>
    </item>
    <item>
      <title>Re: databricks job cancel does not wait for termination of streaming tasks</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-job-cancel-does-not-wait-for-termination-of-streaming/m-p/127443#M47971</link>
      <description>&lt;P&gt;If the “reporting” layer is essentially micro-batching over bounded backlogs, run it with &lt;CODE data-start="3057" data-end="3071"&gt;availableNow&lt;/CODE&gt; (or as a scheduled batch job) so that each run is naturally &lt;EM data-start="3124" data-end="3133"&gt;bounded&lt;/EM&gt; and exits cleanly on its own, with no manual cancel needed. This greatly reduces the chance of partial micro-batches during redeploys.&lt;/P&gt;</description>
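      <content:encoded><![CDATA[<P>As a rough illustration of the availableNow suggestion, here is a minimal sketch rather than a definitive implementation. The helper name run_until_caught_up and the checkpoint path are hypothetical. The writer argument is assumed to behave like a PySpark DataStreamWriter, for example the result of df.writeStream.foreachBatch(...), and only the option, trigger, start and awaitTermination calls are used.</P>

```python
def run_until_caught_up(writer, checkpoint_path):
    """Start a bounded streaming run and block until it has finished.

    `writer` is assumed to behave like a PySpark DataStreamWriter.
    `checkpoint_path` is a hypothetical location chosen by the caller.
    """
    query = (
        writer
        .option("checkpointLocation", checkpoint_path)
        # availableNow processes the backlog present at start, then stops.
        .trigger(availableNow=True)
        .start()
    )
    # Returns once the bounded run has drained its backlog and terminated.
    query.awaitTermination()
    return query
```

<P>Because the run ends on its own, a redeploy can happen between runs instead of cancelling a run that is still executing a batch. The trade-off is latency between runs, so this may not suit a strictly 24/7 near-real-time requirement.</P>]]></content:encoded>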
      <pubDate>Tue, 05 Aug 2025 10:57:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-job-cancel-does-not-wait-for-termination-of-streaming/m-p/127443#M47971</guid>
      <dc:creator>Vidhi_Khaitan</dc:creator>
      <dc:date>2025-08-05T10:57:40Z</dc:date>
    </item>
    <item>
      <title>Re: databricks job cancel does not wait for termination of streaming tasks</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-job-cancel-does-not-wait-for-termination-of-streaming/m-p/127445#M47972</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/164253"&gt;@Vidhi_Khaitan&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;We run the reporting-layer streams 24/7 to serve near-real-time analytics, and each one executes a MERGE statement in foreachBatch. We cannot adopt the scheduled approach. I was exploring the continuous trigger in Databricks jobs to keep the streams running 24/7. Could that be a solution for my case? Will it stop or pause the streams gracefully?&lt;/P&gt;</description>
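      <content:encoded><![CDATA[<P>Whether a particular job trigger stops streams gracefully is not settled in this thread. Separately from that, one application-level pattern is to stop each query from inside the task's own code at a moment when no micro-batch is running. The following is a minimal sketch under assumptions: the helper name stop_between_batches and its parameters are hypothetical, and the query argument is assumed to behave like a PySpark StreamingQuery, using only isActive, status["isTriggerActive"] and stop(). How the task learns that a stop was requested (for example a marker file or a control table) is left to the caller.</P>

```python
import time


def stop_between_batches(query, poll_seconds=1.0, max_wait_seconds=300.0,
                         sleep=time.sleep):
    """Stop `query` at a moment when no micro-batch is being processed.

    Returns True if the query was stopped while idle (or had already ended),
    and False if the deadline passed and it was stopped anyway.
    """
    waited = 0.0
    while query.isActive and waited < max_wait_seconds:
        # status is a dict; isTriggerActive is True while a batch is running.
        if not query.status["isTriggerActive"]:
            query.stop()
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    if query.isActive:
        # Deadline reached while batches kept running: stop regardless.
        query.stop()
        return False
    return True
```

<P>This is best effort only: a new micro-batch could start between the status check and the stop() call, and the helper has to be invoked by the task's own code, since cancelling the job run from outside would not call it. Keeping the foreachBatch MERGE idempotent remains the safer guard against replayed batches.</P>]]></content:encoded>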
      <pubDate>Tue, 05 Aug 2025 11:10:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-job-cancel-does-not-wait-for-termination-of-streaming/m-p/127445#M47972</guid>
      <dc:creator>Sadam97</dc:creator>
      <dc:date>2025-08-05T11:10:15Z</dc:date>
    </item>
  </channel>
</rss>

