<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Driver and worker node utilisation in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75883#M7691</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/105077"&gt;@pjv&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Greetings!&lt;/P&gt;
&lt;P&gt;The choice between having a worker node of the same type as your driver node or using a single-node compute cluster depends on the nature of your workload.&lt;/P&gt;
&lt;P&gt;Suppose you are running normal Python code on a notebook as a job on this cluster, and your workload is not distributed (i.e., it doesn't require Spark's distributed computing capabilities). In that case, a single-node compute cluster might be sufficient. In a single-node compute cluster, the driver acts as both master and worker, with no worker nodes. This is intended for jobs that use small amounts of data or non-distributed workloads such as single-node machine learning libraries.&lt;/P&gt;
&lt;P&gt;However, if your workload involves large-scale data processing or distributed machine learning tasks, adding worker nodes of the same type as your driver node could improve performance. In a multi-node compute cluster, the worker nodes run the Spark executors and other services the compute needs to function properly; all of the distributed processing happens on the worker nodes.&lt;/P&gt;
&lt;P&gt;I hope this answers your query!&lt;/P&gt;
&lt;P&gt;Kind regards,&lt;/P&gt;
&lt;P&gt;Ravi&lt;/P&gt;</description>
    <pubDate>Wed, 26 Jun 2024 19:03:37 GMT</pubDate>
    <dc:creator>Ravivarma</dc:creator>
    <dc:date>2024-06-26T19:03:37Z</dc:date>
    <item>
      <title>Driver and worker node utilisation</title>
      <link>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75855#M7689</link>
      <description>&lt;P&gt;Hi all!&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone tell me whether having worker node(s) of the same type as my driver node makes a difference performance-wise when I run normal Python code in a notebook as a job on this cluster? I am mostly running machine learning libraries such as torch/huggingface. Or is a single-node compute cluster sufficient, i.e. only a single driver/worker node? Thank you!&lt;/P&gt;&lt;P&gt;Kind regards,&lt;/P&gt;&lt;P&gt;pjv&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jun 2024 13:53:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75855#M7689</guid>
      <dc:creator>pjv</dc:creator>
      <dc:date>2024-06-26T13:53:31Z</dc:date>
    </item>
    <item>
      <title>Re: Driver and worker node utilisation</title>
      <link>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75880#M7690</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/105077"&gt;@pjv&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;Greetings of the day!&lt;/P&gt;
&lt;P&gt;The choice between having a worker node of the same type as your driver node or using a single-node compute cluster depends on the nature of your workload.&lt;/P&gt;
&lt;P&gt;Suppose you are running normal Python code on a notebook as a job on this cluster, and your workload is not distributed (i.e., it doesn't require Spark's distributed computing capabilities). In that case, a single-node compute cluster might be sufficient. In a single-node compute cluster, the driver acts as both master and worker, with no worker nodes. This is intended for jobs that use small amounts of data or for non-distributed workloads such as single-node machine learning libraries.&lt;BR /&gt;However, if your workload involves large-scale data processing or distributed machine learning tasks, adding worker nodes of the same type as your driver node could improve performance. In a multi-node compute cluster, the worker nodes run the Spark executors and other services the compute needs to function properly; all of the distributed processing happens on the worker nodes.&lt;/P&gt;
&lt;P&gt;I hope this clears your doubts!&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jun 2024 18:59:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75880#M7690</guid>
      <dc:creator>Ravivarma</dc:creator>
      <dc:date>2024-06-26T18:59:26Z</dc:date>
    </item>
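    <!-- The single-node vs. multi-node distinction above maps directly onto cluster configuration. A minimal sketch, expressed as Databricks Clusters API payloads: the node type, Spark version, and worker count are illustrative placeholders, not recommendations; the singleNode profile settings follow Databricks' documented single-node pattern.

```python
# Sketch of the two cluster shapes discussed above, expressed as
# Databricks Clusters API payloads. Values such as node_type_id and
# spark_version are illustrative placeholders.

# Single-node compute: the driver acts as both master and worker.
single_node = {
    "spark_version": "14.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 0,  # no separate worker nodes
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",  # executors run inside the driver
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}

# Multi-node compute: Spark executors run on the worker nodes.
multi_node = {
    "spark_version": "14.3.x-scala2.12",
    "node_type_id": "i3.xlarge",  # workers of the same type as the driver
    "driver_node_type_id": "i3.xlarge",
    "num_workers": 4,  # distributed processing happens here
}

print(single_node["num_workers"], multi_node["num_workers"])
```

Either dictionary would be supplied as the `new_cluster` block of a job definition; the point is only that "single node" is literally `num_workers: 0` with the driver doing all the work. -->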
    <item>
      <title>Re: Driver and worker node utilisation</title>
      <link>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75889#M7693</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/105077"&gt;@pjv&lt;/a&gt;&amp;nbsp;,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Plain Python code runs solely on the driver node, so in this case only the driver's compute matters.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your code submits any PySpark transformations or actions, it will generate Spark plans that are later executed on the executors.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jun 2024 19:55:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75889#M7693</guid>
      <dc:creator>raphaelblg</dc:creator>
      <dc:date>2024-06-26T19:55:19Z</dc:date>
    </item>
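    <!-- A minimal sketch of the split described above: pure Python work versus PySpark work. The PySpark function assumes a Spark-enabled environment and is shown for illustration only; it is defined but not invoked here.

```python
from typing import List

def driver_side_work(n: int) -> List[int]:
    """Plain Python like this list comprehension executes entirely on
    the driver node; any worker nodes sit idle."""
    return [i * i for i in range(n)]

def distributed_work() -> int:
    """PySpark sketch: transformations only build a plan on the driver;
    the count() action ships the work to the executors. Requires a
    Spark environment to actually run."""
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.getOrCreate()
    df = spark.range(1_000_000)                   # lazy: defines the dataset
    doubled = df.selectExpr("id * 2 AS doubled")  # transformation: plan only
    return doubled.count()                        # action: executors do the work

print(driver_side_work(5))  # → [0, 1, 4, 9, 16]
```

On a multi-node cluster, only code like `distributed_work` ever touches the workers; torch/huggingface training loops written as plain Python behave like `driver_side_work`. -->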
    <item>
      <title>Re: Driver and worker node utilisation</title>
      <link>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75922#M7694</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97998"&gt;@raphaelblg&lt;/a&gt;&amp;nbsp;! That makes sense.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2024 07:05:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/75922#M7694</guid>
      <dc:creator>pjv</dc:creator>
      <dc:date>2024-06-27T07:05:14Z</dc:date>
    </item>
    <item>
      <title>Re: Driver and worker node utilisation</title>
      <link>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/78049#M7695</link>
      <description>&lt;P&gt;Yes, you are right, I agree with you.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jul 2024 07:58:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/driver-and-worker-node-utalisation/m-p/78049#M7695</guid>
      <dc:creator>SeanRay</dc:creator>
      <dc:date>2024-07-10T07:58:48Z</dc:date>
    </item>
  </channel>
</rss>

