<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Spark doesn't register executors when new workers are allocated in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/spark-doesn-t-register-executors-when-new-workers-are-allocated/m-p/56997#M6334</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/81460"&gt;@ivanychev&lt;/a&gt;&amp;nbsp; - Firstly, the new workers are added and Spark does notice them, which is why the event log contains an entry stating that the init script ran on the newly added workers.&amp;nbsp; For debugging, please check the Executors tab in the Spark UI.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Secondly, spot instance termination is mostly driven by the cloud provider and by spot instance price fluctuation. Ideally, you can use a hybrid cluster (spot with fallback to on-demand) by setting the corresponding flag on the cluster configuration page.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Reference:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/compute/cluster-config-best-practices.html#on-demand-and-spot-instances" target="_blank"&gt;https://docs.databricks.com/en/compute/cluster-config-best-practices.html#on-demand-and-spot-instances&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Thanks, Shan&lt;/P&gt;</description>
    <pubDate>Thu, 11 Jan 2024 21:43:54 GMT</pubDate>
    <dc:creator>shan_chandra</dc:creator>
    <dc:date>2024-01-11T21:43:54Z</dc:date>
    <item>
      <title>Spark doesn't register executors when new workers are allocated</title>
      <link>https://community.databricks.com/t5/get-started-discussions/spark-doesn-t-register-executors-when-new-workers-are-allocated/m-p/55041#M6333</link>
      <description>&lt;P&gt;Our pipelines sometimes get stuck (&lt;A href="https://dbc-419e990a-a2db.cloud.databricks.com/?o=714194483280371#job/552043212260350/run/323819797672209" target="_blank"&gt;example&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;Some workers get decommissioned due to spot termination, and then new workers get added.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2023-12-11 at 11.12.05.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/5563i3B3FC1FC4F2C53D8/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Screenshot 2023-12-11 at 11.12.05.png" alt="Screenshot 2023-12-11 at 11.12.05.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;However, after (1), Spark doesn't notice the new executors:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2023-12-11 at 11.08.56.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/5564i159D8C2E8EE32AC1/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Screenshot 2023-12-11 at 11.08.56.png" alt="Screenshot 2023-12-11 at 11.08.56.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I don't know why this happens or how to debug it, but here are some of my observations:&lt;BR /&gt;&lt;BR /&gt;* The init script logs of the workers that Spark doesn't notice look fine: the scripts complete successfully.&lt;/P&gt;&lt;P&gt;* The driver logs don't show anything significant after the old executors get decommissioned. The driver simply doesn't notice the new executors.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Screenshot 2023-12-11 at 11.48.50.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/5565i610B67756275A076/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Screenshot 2023-12-11 at 11.48.50.png" alt="Screenshot 2023-12-11 at 11.48.50.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;How do I debug this, and what could the issue be?&lt;/P&gt;</description>
      <pubDate>Mon, 11 Dec 2023 10:49:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/spark-doesn-t-register-executors-when-new-workers-are-allocated/m-p/55041#M6333</guid>
      <dc:creator>ivanychev</dc:creator>
      <dc:date>2023-12-11T10:49:55Z</dc:date>
    </item>
    <item>
      <title>Re: Spark doesn't register executors when new workers are allocated</title>
      <link>https://community.databricks.com/t5/get-started-discussions/spark-doesn-t-register-executors-when-new-workers-are-allocated/m-p/56997#M6334</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/81460"&gt;@ivanychev&lt;/a&gt;&amp;nbsp; - Firstly, the new workers are added and Spark does notice them, which is why the event log contains an entry stating that the init script ran on the newly added workers.&amp;nbsp; For debugging, please check the Executors tab in the Spark UI.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Secondly, spot instance termination is mostly driven by the cloud provider and by spot instance price fluctuation. Ideally, you can use a hybrid cluster (spot with fallback to on-demand) by setting the corresponding flag on the cluster configuration page.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Reference:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/compute/cluster-config-best-practices.html#on-demand-and-spot-instances" target="_blank"&gt;https://docs.databricks.com/en/compute/cluster-config-best-practices.html#on-demand-and-spot-instances&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Thanks, Shan&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2024 21:43:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/spark-doesn-t-register-executors-when-new-workers-are-allocated/m-p/56997#M6334</guid>
      <dc:creator>shan_chandra</dc:creator>
      <dc:date>2024-01-11T21:43:54Z</dc:date>
    </item>
  </channel>
</rss>

