<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Update code for a streaming job in Production in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21117#M14352</link>
    <description>&lt;P&gt;Can you provide the source and sink type?&lt;/P&gt;</description>
    <pubDate>Wed, 10 Nov 2021 15:26:49 GMT</pubDate>
    <dc:creator>Sandeep</dc:creator>
    <dc:date>2021-11-10T15:26:49Z</dc:date>
    <item>
      <title>Update code for a streaming job in Production</title>
      <link>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21114#M14349</link>
      <description>&lt;P&gt;How do you update a streaming job in production with minimal or no downtime when there are significant code changes that are incompatible with the existing checkpoint state, so the stream cannot simply resume?&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jun 2021 21:52:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21114#M14349</guid>
      <dc:creator>User16783853906</dc:creator>
      <dc:date>2021-06-23T21:52:55Z</dc:date>
    </item>
    <item>
      <title>Re: Update code for a streaming job in Production</title>
      <link>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21115#M14350</link>
      <description>&lt;P&gt;This will likely depend on your use case. Can you share an example of your current streaming setup and the kinds of changes you anticipate making with minimal downtime?&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jun 2021 23:10:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21115#M14350</guid>
      <dc:creator>aladda</dc:creator>
      <dc:date>2021-06-23T23:10:57Z</dc:date>
    </item>
    <item>
      <title>Re: Update code for a streaming job in Production</title>
      <link>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21116#M14351</link>
      <description>&lt;OL&gt;&lt;LI&gt;First, check whether the code changes are compatible with the existing checkpoint; if they are not, you will need to start with a new checkpoint. More information on the types of changes that are allowed: &lt;A href="https://docs.databricks.com/spark/latest/structured-streaming/production.html#types-of-changes" alt="https://docs.databricks.com/spark/latest/structured-streaming/production.html#types-of-changes" target="_blank"&gt;https://docs.databricks.com/spark/latest/structured-streaming/production.html#types-of-changes&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;If you go with a new checkpoint and do not specify a starting point for the&amp;nbsp;source, the framework will fetch all of the data from the source again. In that case you must be prepared to handle duplicates, or they will be written to the sink. To handle duplicates, you can use dropDuplicates, a merge into the sink, or row_number&amp;nbsp;based ranking filtered to rank 1.&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Thu, 16 Sep 2021 10:38:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21116#M14351</guid>
      <dc:creator>Deepak_Bhutada</dc:creator>
      <dc:date>2021-09-16T10:38:44Z</dc:date>
    </item>
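    <!--
    The deduplication options in the reply above (dropDuplicates, a merge, or row_number-based ranking) all reduce to the same idea: after a full source replay from a new checkpoint, keep one record per business key. A minimal sketch of that logic in plain Python; the key and timestamp column names here are hypothetical, and in Spark you would express the same thing with dropDuplicates or a row_number window filtered to rank 1:

    ```python
    # Keep only the latest record per key, mimicking
    # row_number().over(Window.partitionBy("id").orderBy(desc("ts"))) == 1
    # applied after a full replay of the source.
    def dedupe_latest(rows, key="id", order="ts"):
        latest = {}
        for row in rows:
            k = row[key]
            # Replace the stored record only if this one is newer.
            if k not in latest or row[order] > latest[k][order]:
                latest[k] = row
        return sorted(latest.values(), key=lambda r: r[key])

    # Replayed source: ids 1 and 2 each appear twice with different timestamps.
    replayed = [
        {"id": 1, "ts": 1, "value": "old"},
        {"id": 2, "ts": 5, "value": "keep"},
        {"id": 1, "ts": 9, "value": "new"},
        {"id": 2, "ts": 2, "value": "old"},
    ]
    deduped = dedupe_latest(replayed)
    # deduped keeps only {"id": 1, "ts": 9, ...} and {"id": 2, "ts": 5, ...}
    ```

    The same filtering can run as part of the streaming write, so duplicates never reach the sink even when the whole source is refetched.
    -->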
    <item>
      <title>Re: Update code for a streaming job in Production</title>
      <link>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21117#M14352</link>
      <description>&lt;P&gt;Can you provide the source and sink type?&lt;/P&gt;</description>
      <pubDate>Wed, 10 Nov 2021 15:26:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21117#M14352</guid>
      <dc:creator>Sandeep</dc:creator>
      <dc:date>2021-11-10T15:26:49Z</dc:date>
    </item>
    <item>
      <title>Re: Update code for a streaming job in Production</title>
      <link>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21118#M14353</link>
      <description>&lt;P&gt;I have the same scenario: the source type is Parquet and the sink type is Delta, both in Azure Data Lake Gen2. I need to change the checkpoint location; how can we exclude the files that have already been processed? Can we do this without using the Auto Loader feature? Please confirm.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jul 2022 10:33:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/update-code-for-a-streaming-job-in-production/m-p/21118#M14353</guid>
      <dc:creator>Himanshi</dc:creator>
      <dc:date>2022-07-21T10:33:18Z</dc:date>
    </item>
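    <!--
    One common way to restart from a new checkpoint without double-writing already-processed files is to make the sink idempotent: upsert each micro-batch by key, which on Databricks is typically a Delta MERGE inside foreachBatch, so a replay overwrites rather than appends. A toy sketch of that idempotent-upsert idea in plain Python; the dict stands in for the Delta table and all names here are hypothetical, not the actual Delta API:

    ```python
    # A dict keyed by "id" stands in for the Delta sink. upsert_batch plays
    # the role of a MERGE inside foreachBatch: matched keys are updated and
    # new keys inserted, so replaying the same micro-batch changes nothing.
    def upsert_batch(table, batch, key="id"):
        for row in batch:
            table[row[key]] = row  # update if present, insert if not
        return table

    sink = {}
    batch = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
    upsert_batch(sink, batch)
    upsert_batch(sink, batch)  # replay after a checkpoint reset
    assert len(sink) == 2      # no duplicates despite reprocessing
    ```

    With an upsert-style sink, it matters much less which files the restarted stream re-reads: reprocessed rows land on their existing keys instead of accumulating as duplicates.
    -->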
  </channel>
</rss>

