<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What is "ExecuteGrpcResponseSender: Deadline reached, shutting down stream" in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/97307#M39472</link>
    <description>&lt;P&gt;This might be a bug. The issue goes away if I change the cluster from shared mode to single user mode.&lt;/P&gt;</description>
    <pubDate>Fri, 01 Nov 2024 23:42:56 GMT</pubDate>
    <dc:creator>MikeGo</dc:creator>
    <dc:date>2024-11-01T23:42:56Z</dc:date>
    <item>
      <title>What is "ExecuteGrpcResponseSender: Deadline reached, shutting down stream"</title>
      <link>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/93678#M38743</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a Delta table that is loaded by a Structured Streaming job. When I read this Delta table as a stream and run a MERGE inside foreachBatch, there is sometimes a long gap between the stream starting and the MERGE starting to run, as if Spark is waiting for something. In the log I can see:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;INFO ExecuteGrpcResponseSender: Starting for opId=5ef071b7-xxx, reattachable=true, lastConsumedStreamIndex=0
...
INFO SessionHolder: Session SessionKey(69xxx,04470efa-xxxx) accessed, time 1728792222507.
...
INFO ExecuteGrpcResponseSender: Deadline reached, shutting down stream for opId=5ef071b7-xxx after index 0. totalTime=120001284340ns waitingForResults=120001197790ns waitingForSend=0ns
INFO SessionHolder: Session SessionKey(69xxx,04470efa-xxxx) accessed, time 1728792342527.
INFO ExecuteGrpcResponseSender: Starting for opId=5ef071b7-xxx, reattachable=true, lastConsumedStreamIndex=0
...&lt;/LI-CODE&gt;&lt;P&gt;There are many "INFO ExecuteGrpcResponseSender: Deadline reached, shutting down stream..." entries, and it seems something times out after 120 s. I tried setting:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;spark.network.timeout: 800s
spark.streaming.backpressure.enabled: true&lt;/LI-CODE&gt;&lt;P&gt;but I still see those deadline messages in the log.&lt;BR /&gt;What is happening here? Is there a config I can set to get rid of this? It seems to slow down the job.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Sun, 13 Oct 2024 04:35:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/93678#M38743</guid>
      <dc:creator>MikeGo</dc:creator>
      <dc:date>2024-10-13T04:35:36Z</dc:date>
    </item>
    <item>
      <title>Re: What is "ExecuteGrpcResponseSender: Deadline reached, shutting down stream"</title>
      <link>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/97164#M39444</link>
      <description>&lt;P&gt;We need to understand why the upstream of the REPL cancelled the request. It could be resource exhaustion. Do you see "java.lang.OutOfMemoryError"?&lt;/P&gt;
&lt;P&gt;In a past issue, I saw &lt;A href="https://issues.apache.org/jira/browse/SPARK-49492" target="_blank"&gt;https://issues.apache.org/jira/browse/SPARK-49492&lt;/A&gt; turn out to be the cause of such an error.&lt;/P&gt;
&lt;P&gt;Do you see this issue regularly, or is it intermittent? Restarting the cluster will mitigate the issue, but to review the logs and investigate further you may have to enable cluster log delivery.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Nov 2024 06:37:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/97164#M39444</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2024-11-01T06:37:05Z</dc:date>
    </item>
    <item>
      <title>Re: What is "ExecuteGrpcResponseSender: Deadline reached, shutting down stream"</title>
      <link>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/97307#M39472</link>
      <description>&lt;P&gt;This might be a bug. The issue goes away if I change the cluster from shared mode to single user mode.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Nov 2024 23:42:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/97307#M39472</guid>
      <dc:creator>MikeGo</dc:creator>
      <dc:date>2024-11-01T23:42:56Z</dc:date>
    </item>
    <item>
      <title>Re: What is "ExecuteGrpcResponseSender: Deadline reached, shutting down stream"</title>
      <link>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/97709#M39525</link>
      <description>&lt;P&gt;It may not necessarily be a bug; it may just need some tuning due to architectural differences.&lt;/P&gt;
&lt;P&gt;What the message says is:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The system was processing a gRPC operation identified by &lt;CODE&gt;opId=5ef071b7-xxx&lt;/CODE&gt;, and it set a deadline for that operation (likely 120 seconds).&lt;/LI&gt;
&lt;LI&gt;The operation didn't complete in time and exceeded the deadline, so the system has shut down the stream and stopped waiting for further results.&lt;/LI&gt;
&lt;LI&gt;The operation spent almost all of its time (around 120 seconds) waiting for results and did not spend any time in the process of sending data back to the client.&lt;/LI&gt;
&lt;LI&gt;It is an INFO message, indicating an event.&lt;/LI&gt;
&lt;LI&gt;Shared mode uses Spark Connect under the hood, but single user mode does not, and hence we do not see these log lines on a single user cluster.&lt;/LI&gt;
&lt;/UL&gt;
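&lt;P&gt;As a rough illustration of the timing fields above, here is a minimal Python sketch that converts the nanosecond values from the log line in the question into seconds. The helper name parse_timings and the regular expression are my own, written against the one sample line from this thread, so treat them as assumptions rather than a documented log format.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import re

# Sample line copied from the log in the question.
LOG_LINE = (
    "INFO ExecuteGrpcResponseSender: Deadline reached, shutting down stream "
    "for opId=5ef071b7-xxx after index 0. totalTime=120001284340ns "
    "waitingForResults=120001197790ns waitingForSend=0ns"
)

def parse_timings(line):
    """Return a dict mapping each name=NNNns field in the line to seconds."""
    # Hypothetical pattern: a word, an equals sign, digits, then the "ns" suffix.
    pairs = re.findall(r"(\w+)=(\d+)ns", line)
    return {name: int(ns) / 1e9 for name, ns in pairs}

timings = parse_timings(LOG_LINE)
# Fraction of the stream lifetime spent waiting for the query to produce results.
share = timings["waitingForResults"] / timings["totalTime"]

for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f} s")
print(f"share of time waiting for results: {share:.4%}")&lt;/LI-CODE&gt;
&lt;P&gt;For the sample line this gives roughly 120 s for both totalTime and waitingForResults and 0 s for waitingForSend, which matches the reading above: the sender was idle, waiting for the query, for practically the whole window.&lt;/P&gt;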
&lt;P&gt;As next steps:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;We can try to understand the cause of the delay on the external resource or service where the request is sent.&lt;/LI&gt;
&lt;LI&gt;It is possible the timeouts are not the same, or need to be bumped up.&lt;/LI&gt;
&lt;LI&gt;Are there any other errors that you see along with these messages?&lt;/LI&gt;
&lt;/UL&gt;
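&lt;P&gt;On the timeout point: as far as I know, in open-source Spark Connect the lifetime of a reattachable response stream is governed by the server config spark.connect.execute.reattachable.senderMaxStreamDuration, which defaults to 2m and would match the 120 s seen in the log. Whether a Databricks shared cluster honours this setting is an assumption I have not verified, and the 800s value below is only a placeholder mirroring the value used in the question. If it applies at all, I would expect it to be a cluster-level Spark config rather than a runtime setting:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;spark.connect.execute.reattachable.senderMaxStreamDuration: 800s&lt;/LI-CODE&gt;
&lt;P&gt;Note that this would only change how often the response stream is closed and reattached; it would not shorten the time spent in waitingForResults.&lt;/P&gt;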
&lt;P&gt;That said, yes, it may need a more in-depth look.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Nov 2024 08:15:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-is-quot-executegrpcresponsesender-deadline-reached-shutting/m-p/97709#M39525</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2024-11-05T08:15:30Z</dc:date>
    </item>
  </channel>
</rss>

