Cluster Sizing

Anonymous
Not applicable

How big should my cluster be? How do I know how many nodes to use or which instance type to choose?

1 ACCEPTED SOLUTION (the first reply from Anonymous below)


3 REPLIES

Anonymous
Not applicable

It depends a lot on the use case (ETL vs. machine learning and model training, for example) and on the size of your data.

There's a good section in the documentation that gives some general guidance and a process for sizing:

Best practices: Cluster configuration | Databricks on AWS

This article is based in part on the Databricks Academy course Optimizing Apache Spark on Databricks. The course is 100% free and goes a bit deeper into the considerations behind this decision, including usage, cloud costs, and SLAs.
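
If it helps to see what those knobs look like in practice, below is a minimal sketch of creating an autoscaling cluster through the Databricks Clusters REST API (POST /api/2.0/clusters/create). The workspace URL, token, runtime version, instance type, and worker counts are all placeholder assumptions; use the best-practices page above to decide the real values.

```python
# Minimal sketch (illustrative, not official guidance): create an autoscaling
# cluster via the Databricks Clusters REST API. All values below are placeholder
# assumptions -- pick a runtime, instance type, and worker range that fit your
# workload and budget.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
API_TOKEN = "<personal-access-token>"  # placeholder; keep real tokens in a secret store

cluster_spec = {
    "cluster_name": "etl-autoscaling-example",
    "spark_version": "13.3.x-scala2.12",                # example LTS runtime string
    "node_type_id": "i3.xlarge",                        # example AWS instance type
    "autoscale": {"min_workers": 2, "max_workers": 8},  # let the cluster grow with load
    "autotermination_minutes": 60,                      # shut down idle clusters
}

response = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=cluster_spec,
)
response.raise_for_status()
print("Created cluster:", response.json()["cluster_id"])
```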

sajith_appukutt
Honored Contributor II

>How big should my cluster be? 

This really depends on the use case. Some general guiding principles can be found here: https://docs.databricks.com/clusters/cluster-config-best-practices.html
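
To make the "depends on the size of your data" part concrete, here is a rough back-of-envelope heuristic (an editorial illustration, not something from the linked page): size the worker count from the number of input partitions, assuming Spark's default target of roughly 128 MB per partition and a core count taken from whatever instance type you choose.

```python
import math

# Back-of-envelope sizing heuristic (illustrative only): one Spark task per input
# partition, and a few task "waves" per core so the cluster stays busy without
# being dramatically over-provisioned.
data_size_gb = 100      # assumption: data scanned per run
partition_mb = 128      # Spark's default target partition size for file reads
cores_per_worker = 16   # assumption: depends on the instance type you pick
task_waves = 3          # assumption: aim for roughly 2-4 waves of tasks per core

num_partitions = math.ceil(data_size_gb * 1024 / partition_mb)
total_cores = math.ceil(num_partitions / task_waves)
workers = math.ceil(total_cores / cores_per_worker)

print(f"~{num_partitions} partitions -> ~{total_cores} cores -> ~{workers} workers")
```

Treat the result as a starting point only; memory pressure, shuffle-heavy stages, and SLA requirements can all push the number up or down.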
