<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Parallel kafka consumer in spark structured streaming in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/parallel-kafka-consumer-in-spark-structured-streaming/m-p/70276#M7277</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a Spark streaming job that reads from Kafka, processes the data, and writes to Delta Lake.&lt;/P&gt;&lt;P&gt;Number of Kafka partitions: 100&lt;/P&gt;&lt;P&gt;Number of executors: 2 (4 cores each)&lt;/P&gt;&lt;P&gt;So we have 8 cores in total reading from the 100 partitions of a topic. I want to understand whether Spark internally spins up multiple threads to read from multiple partitions in parallel. If not, is there any way to spin up multiple threads for the Kafka consumer?&lt;/P&gt;</description>
    <pubDate>Wed, 22 May 2024 14:29:11 GMT</pubDate>
    <dc:creator>subham0611</dc:creator>
    <dc:date>2024-05-22T14:29:11Z</dc:date>
    <item>
      <title>Parallel kafka consumer in spark structured streaming</title>
      <link>https://community.databricks.com/t5/get-started-discussions/parallel-kafka-consumer-in-spark-structured-streaming/m-p/70276#M7277</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a Spark streaming job that reads from Kafka, processes the data, and writes to Delta Lake.&lt;/P&gt;&lt;P&gt;Number of Kafka partitions: 100&lt;/P&gt;&lt;P&gt;Number of executors: 2 (4 cores each)&lt;/P&gt;&lt;P&gt;So we have 8 cores in total reading from the 100 partitions of a topic. I want to understand whether Spark internally spins up multiple threads to read from multiple partitions in parallel. If not, is there any way to spin up multiple threads for the Kafka consumer?&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2024 14:29:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/parallel-kafka-consumer-in-spark-structured-streaming/m-p/70276#M7277</guid>
      <dc:creator>subham0611</dc:creator>
      <dc:date>2024-05-22T14:29:11Z</dc:date>
    </item>
  </channel>
</rss>

