Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

which type of cluster to use

Avinash_Narala
Contributor

Hi,

Recently, I wrote some logic that collects a DataFrame to the driver and processes it row by row. I am using a 128 GB driver node, but it is taking far longer than expected (about 2 hours for just 700 rows of data).

May I know which type of cluster I should use, and what driver size?

Accepted Solutions

Ayushi_Suthar
Databricks Employee

Hi @Avinash_Narala, good day!

To right-size the cluster, the recommended approach is hybrid node provisioning combined with autoscaling. You define how many on-demand instances and spot instances the cluster uses, and enable autoscaling between a minimum and a maximum number of instances. This lets the cluster scale up and down with the load. Please also refer to the documents below for more information.
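As a rough illustration of the hybrid-plus-autoscaling setup described above, here is a minimal sketch that builds a cluster specification as a plain Python dictionary. It assumes an AWS workspace and the field names used by the Databricks Clusters API (`autoscale`, `aws_attributes.first_on_demand`, `aws_attributes.availability`). The helper `build_cluster_spec`, the cluster name, the node types, the runtime version, and the worker counts are all hypothetical placeholders, not values recommended in this thread.

```python
import json


def build_cluster_spec(min_workers, max_workers, on_demand_nodes):
    """Build a cluster spec mixing on-demand and spot nodes with autoscaling.

    Hypothetical helper for illustration; all concrete values are placeholders.
    """
    if not 0 < min_workers <= max_workers:
        raise ValueError("need 0 < min_workers <= max_workers")
    return {
        "cluster_name": "hybrid-autoscaling-example",  # placeholder name
        "spark_version": "14.3.x-scala2.12",           # placeholder runtime
        "node_type_id": "i3.xlarge",                   # placeholder worker type
        "driver_node_type_id": "i3.xlarge",            # placeholder driver type
        # Autoscaling range: the cluster scales between these worker counts.
        "autoscale": {
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
        "aws_attributes": {
            # The first N nodes (the driver counts as one) are on-demand.
            "first_on_demand": on_demand_nodes,
            # Remaining nodes use spot, falling back to on-demand if needed.
            "availability": "SPOT_WITH_FALLBACK",
        },
    }


spec = build_cluster_spec(min_workers=2, max_workers=8, on_demand_nodes=1)
print(json.dumps(spec, indent=2))
```

A dictionary like this could be sent as the JSON body of a cluster-create request or adapted into a job cluster definition. Keeping the driver on an on-demand instance is a common choice, since losing the driver to a spot reclaim ends the whole job.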

Please let me know if this helps, and leave a like if this information is useful. Follow-ups are appreciated.
Kudos
Ayushi


