<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to improve Spark Streaming writer Input Rate and Processing rate? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21881#M14949</link>
    <description>&lt;P&gt;&lt;B&gt;My problem is that numInputRows is not equal to setMaxEventsPerTrigger.&lt;/B&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 04 May 2022 07:51:44 GMT</pubDate>
    <dc:creator>RengarLee</dc:creator>
    <dc:date>2022-05-04T07:51:44Z</dc:date>
    <item>
      <title>How to improve Spark Streaming writer Input Rate and Processing rate?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21876#M14944</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;&lt;P&gt;I have several questions about Spark Streaming and Azure Event Hubs.&lt;/P&gt;&lt;P&gt;Can you help me?&lt;/P&gt;&lt;P&gt;&lt;B&gt;Q1: How can I improve the Spark Streaming writer's input rate and processing rate?&lt;/B&gt;&lt;/P&gt;&lt;P&gt;I connect to Azure Event Hubs using Spark Streaming on Azure Databricks. When I use display, the input rate is very high, but when I use a writer it is very slow. The result is shown in Picture 1, and the code is in Picture 2 and Picture 3. I want to raise the writer's input rate and processing rate until the event hub's outgoing bytes exceed its incoming bytes, as they do with display.&lt;/P&gt;&lt;P&gt;What should I do?&lt;/P&gt;&lt;P&gt;&lt;B&gt;Q2: Why is numInputRows not equal to setMaxEventsPerTrigger?&lt;/B&gt;&lt;/P&gt;&lt;P&gt;I set setMaxEventsPerTrigger to 10000 on eventhubsConf, but numInputRows in RawData is 1000, as shown in Picture 5. Why does numInputRows not match setMaxEventsPerTrigger?&lt;/P&gt;</description>
      <pubDate>Fri, 29 Apr 2022 04:13:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21876#M14944</guid>
      <dc:creator>RengarLee</dc:creator>
      <dc:date>2022-04-29T04:13:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to improve Spark Streaming writer Input Rate and Processing rate?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21877#M14945</link>
      <description>&lt;P&gt;Hi @Rengar Lee,&lt;/P&gt;&lt;P&gt;How many Event Hubs partitions are you reading from? Check the Ganglia UI to see your cluster utilization. Also, how long does it take to write the data to the sink? You can get the query metrics from the Spark logs.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Apr 2022 21:12:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21877#M14945</guid>
      <dc:creator>jose_gonzalez</dc:creator>
      <dc:date>2022-04-29T21:12:49Z</dc:date>
    </item>
    <item>
      <title>Re: How to improve Spark Streaming writer Input Rate and Processing rate?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21878#M14946</link>
      <description>&lt;P&gt;&lt;B&gt;How many Event Hubs partitions are you reading from?&lt;/B&gt;&lt;/P&gt;&lt;P&gt;Only 1 partition.&lt;/P&gt;&lt;P&gt;&lt;B&gt;Check the Ganglia UI to see your cluster utilization.&lt;/B&gt;&lt;/P&gt;&lt;P&gt;See Picture 1.&lt;/P&gt;&lt;P&gt;&lt;B&gt;How long does it take to write the data to the sink?&lt;/B&gt;&lt;/P&gt;&lt;P&gt;See Picture 2.&lt;/P&gt;&lt;P&gt;I found that numInputRows is always 1000, no matter what value I pass to setMaxEventsPerTrigger.&lt;/P&gt;&lt;P&gt;A batch can't finish writing within 1 second, which delays the next write, so the event hub's outgoing bytes are low.&lt;/P&gt;&lt;P&gt;If I could raise numInputRows above 1000, I think the problem would be resolved.&lt;/P&gt;&lt;P&gt;Event Hubs is shown in Picture 3.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Apr 2022 00:06:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21878#M14946</guid>
      <dc:creator>RengarLee</dc:creator>
      <dc:date>2022-04-30T00:06:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to improve Spark Streaming writer Input Rate and Processing rate?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21879#M14947</link>
      <description />
      <pubDate>Sat, 30 Apr 2022 00:06:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21879#M14947</guid>
      <dc:creator>RengarLee</dc:creator>
      <dc:date>2022-04-30T00:06:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to improve Spark Streaming writer Input Rate and Processing rate?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21880#M14948</link>
      <description />
      <pubDate>Sat, 30 Apr 2022 00:06:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21880#M14948</guid>
      <dc:creator>RengarLee</dc:creator>
      <dc:date>2022-04-30T00:06:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to improve Spark Streaming writer Input Rate and Processing rate?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21881#M14949</link>
      <description>&lt;P&gt;&lt;B&gt;My problem is that numInputRows is not equal to setMaxEventsPerTrigger.&lt;/B&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 04 May 2022 07:51:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-improve-spark-streaming-writer-input-rate-and-processing/m-p/21881#M14949</guid>
      <dc:creator>RengarLee</dc:creator>
      <dc:date>2022-05-04T07:51:44Z</dc:date>
    </item>
  </channel>
</rss>

