<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: which type of cluster to use in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/which-type-of-cluster-to-use/m-p/104657#M41833</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/98941"&gt;@Avinash_Narala&lt;/a&gt;&amp;nbsp;, Good Day!&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;To right-size the cluster, the recommended approach is hybrid node provisioning combined with autoscaling. This means defining how many on-demand instances and spot instances the cluster uses, and enabling autoscaling between a minimum and a maximum number of instances, so the cluster can scale up and down with the load. Please also refer to the documents below for more information.&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://docs.databricks.com/lakehouse-architecture/cost-optimization/best-practices.html" target="_blank" rel="noopener"&gt;Databricks Cost Optimization Best Practices&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://docs.databricks.com/clusters/cluster-config-best-practices.html" target="_blank" rel="noopener"&gt;Databricks Cluster Configuration Best Practices&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Please let me know if this helps, and leave a like if this information is useful. Follow-ups are appreciated.&lt;BR /&gt;Kudos&lt;BR /&gt;Ayushi&lt;/P&gt;</description>
    <pubDate>Wed, 08 Jan 2025 08:35:24 GMT</pubDate>
    <dc:creator>Ayushi_Suthar</dc:creator>
    <dc:date>2025-01-08T08:35:24Z</dc:date>
    <item>
      <title>which type of cluster to use</title>
      <link>https://community.databricks.com/t5/data-engineering/which-type-of-cluster-to-use/m-p/104650#M41829</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I recently wrote some logic that collects a DataFrame and processes it row by row. I am using a 128 GB driver node, but it is taking far longer than expected (about 2 hours for just 700 rows of data).&lt;/P&gt;&lt;P&gt;Which type of cluster should I use, and what driver size?&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2025 07:24:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/which-type-of-cluster-to-use/m-p/104650#M41829</guid>
      <dc:creator>Avinash_Narala</dc:creator>
      <dc:date>2025-01-08T07:24:47Z</dc:date>
    </item>
    <item>
      <title>Re: which type of cluster to use</title>
      <link>https://community.databricks.com/t5/data-engineering/which-type-of-cluster-to-use/m-p/104657#M41833</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/98941"&gt;@Avinash_Narala&lt;/a&gt;&amp;nbsp;, Good Day!&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;To right-size the cluster, the recommended approach is hybrid node provisioning combined with autoscaling. This means defining how many on-demand instances and spot instances the cluster uses, and enabling autoscaling between a minimum and a maximum number of instances, so the cluster can scale up and down with the load. Please also refer to the documents below for more information.&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://docs.databricks.com/lakehouse-architecture/cost-optimization/best-practices.html" target="_blank" rel="noopener"&gt;Databricks Cost Optimization Best Practices&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://docs.databricks.com/clusters/cluster-config-best-practices.html" target="_blank" rel="noopener"&gt;Databricks Cluster Configuration Best Practices&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
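&lt;P&gt;As a rough illustration of the hybrid setup described above, here is a minimal sketch of a cluster spec written as a Python dict. It assumes an AWS workspace and the field names used by the Databricks Clusters API (autoscale, aws_attributes, first_on_demand, availability). The helper build_cluster_spec, the cluster name, runtime version, instance type, and worker counts are all hypothetical placeholders, not sizing recommendations.&lt;/P&gt;

```python
import json


def build_cluster_spec(min_workers, max_workers, first_on_demand):
    """Build a cluster spec that mixes on-demand and spot nodes with autoscaling."""
    return {
        "cluster_name": "hybrid-autoscaling-example",  # hypothetical name
        "spark_version": "15.4.x-scala2.12",  # placeholder runtime version
        "node_type_id": "i3.xlarge",  # placeholder instance type
        "autoscale": {
            # The cluster scales between these two worker counts based on load.
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
        "aws_attributes": {
            # The first N nodes (the driver counts as one) are on-demand.
            "first_on_demand": first_on_demand,
            # Remaining nodes use spot, falling back to on-demand if spot is unavailable.
            "availability": "SPOT_WITH_FALLBACK",
        },
    }


spec = build_cluster_spec(min_workers=2, max_workers=8, first_on_demand=1)
print(json.dumps(spec, indent=2))
```

&lt;P&gt;With first_on_demand set to 1, only the driver is kept on an on-demand instance and the workers run on spot where available. The printed JSON could be adapted as the body of a cluster create request, after replacing the placeholder values with ones that suit your workload.&lt;/P&gt;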
&lt;P&gt;Please let me know if this helps, and leave a like if this information is useful. Follow-ups are appreciated.&lt;BR /&gt;Kudos&lt;BR /&gt;Ayushi&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2025 08:35:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/which-type-of-cluster-to-use/m-p/104657#M41833</guid>
      <dc:creator>Ayushi_Suthar</dc:creator>
      <dc:date>2025-01-08T08:35:24Z</dc:date>
    </item>
  </channel>
</rss>

