<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Streaming with Kafka with the same groupid in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/streaming-with-kafka-with-the-same-groupid/m-p/23350#M16099</link>
    <description>&lt;P&gt;By default, each streaming query generates a unique group ID for reading data, ensuring it has its own consumer group.&lt;/P&gt;&lt;P&gt;In scenarios where you need to specify the group ID (e.g., for authorization), it is not recommended to have two streaming applications use the same group ID. Spark tracks Kafka offsets internally in its checkpoint and does not commit any offsets back to Kafka.&lt;/P&gt;&lt;P&gt;In any case, for sources that don't support exactly-once behaviour, you can achieve idempotency with Delta via MERGE.&lt;/P&gt;</description>
    <pubDate>Wed, 23 Jun 2021 05:26:23 GMT</pubDate>
    <dc:creator>sajith_appukutt</dc:creator>
    <dc:date>2021-06-23T05:26:23Z</dc:date>
    <item>
      <title>Streaming with Kafka with the same groupid</title>
      <link>https://community.databricks.com/t5/data-engineering/streaming-with-kafka-with-the-same-groupid/m-p/23349#M16098</link>
      <description>&lt;P&gt;A Kafka topic has 300 partitions, and I see two clusters running with the same group ID.&lt;/P&gt;&lt;P&gt;Will the data be duplicated in my Delta bronze layer?&lt;/P&gt;</description>
      <pubDate>Thu, 17 Jun 2021 08:34:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/streaming-with-kafka-with-the-same-groupid/m-p/23349#M16098</guid>
      <dc:creator>User16826994223</dc:creator>
      <dc:date>2021-06-17T08:34:31Z</dc:date>
    </item>
    <item>
      <title>Re: Streaming with Kafka with the same groupid</title>
      <link>https://community.databricks.com/t5/data-engineering/streaming-with-kafka-with-the-same-groupid/m-p/23350#M16099</link>
      <description>&lt;P&gt;By default, each streaming query generates a unique group ID for reading data, ensuring it has its own consumer group.&lt;/P&gt;&lt;P&gt;In scenarios where you need to specify the group ID (e.g., for authorization), it is not recommended to have two streaming applications use the same group ID. Spark tracks Kafka offsets internally in its checkpoint and does not commit any offsets back to Kafka.&lt;/P&gt;&lt;P&gt;In any case, for sources that don't support exactly-once behaviour, you can achieve idempotency with Delta via MERGE.&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jun 2021 05:26:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/streaming-with-kafka-with-the-same-groupid/m-p/23350#M16099</guid>
      <dc:creator>sajith_appukutt</dc:creator>
      <dc:date>2021-06-23T05:26:23Z</dc:date>
    </item>
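The reply above points to MERGE-based idempotency as the safety net when a source cannot guarantee exactly-once delivery. As a rough illustration of why an idempotent upsert makes duplicate deliveries harmless, here is a minimal pure-Python sketch. It is not Delta Lake's API: the target table is simulated with a dict keyed by a hypothetical primary-key field, and the merge logic is a plain key-based upsert.

```python
# Minimal sketch of MERGE-style idempotent upserts: applying the same
# batch of records twice leaves the target "table" in the same state,
# which is why a duplicate Kafka delivery does not duplicate rows.

def merge_upsert(table, batch, key_field="id"):
    """Upsert each record into table, keyed by key_field (simulated MERGE)."""
    for record in batch:
        # when matched: update the row; when not matched: insert it
        table[record[key_field]] = record
    return table

bronze = {}
batch = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]

merge_upsert(bronze, batch)  # first delivery
merge_upsert(bronze, batch)  # duplicate delivery (e.g. a second consumer)

print(len(bronze))  # still 2 rows, not 4
```

An append-only write, by contrast, would store four rows after the second delivery; keying the merge on a stable business key is what makes reprocessing safe.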
  </channel>
</rss>

