12-07-2022 06:15 AM
12-07-2022 08:49 AM
12-07-2022 10:35 AM
I like the scale-up before the scale-out.
I used to run multiple clusters, but it does make sense. I like this page:
https://docs.databricks.com/clusters/cluster-config-best-practices.html
Thanks,
Pat.
12-08-2022 12:57 AM
Thanks @Joseph Kambourakis for the inputs.
12-07-2022 10:13 AM
The biggest factor is the cost of compute. I start simple and adjust as needed. However, if one block of code is causing a performance issue, it needs to be fixed first, since no cluster can make bad code better.
In general, I analyze the overall runtime of a workflow and test different cluster sizes and instance types. After a few runs I check the metrics to see how the cluster is performing during the job, and adjust the instance types as necessary.
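As a rough illustration of comparing trial runs, here is a minimal Python sketch. The `TrialRun` class, the cluster labels, the hourly rates, and the runtimes are all made-up assumptions for the example, not real pricing or measurements; the only point is that a bigger instance that finishes sooner can end up cheaper overall.

```python
from dataclasses import dataclass


@dataclass
class TrialRun:
    label: str              # hypothetical name for the cluster configuration
    nodes: int              # driver plus workers
    hourly_rate: float      # assumed all-in cost per node per hour (made-up figure)
    runtime_minutes: float  # observed runtime of the workflow (made-up figure)

    @property
    def cost(self) -> float:
        # Total cost of one run: nodes x rate x hours
        return self.nodes * self.hourly_rate * self.runtime_minutes / 60.0


def cheapest(runs):
    """Return the trial run with the lowest total cost."""
    return min(runs, key=lambda r: r.cost)


runs = [
    TrialRun("4 small nodes", nodes=4, hourly_rate=0.50, runtime_minutes=60),
    TrialRun("2 large nodes", nodes=2, hourly_rate=1.00, runtime_minutes=45),
    TrialRun("8 small nodes", nodes=8, hourly_rate=0.50, runtime_minutes=40),
]

best = cheapest(runs)
print(best.label, round(best.cost, 2))
```

Cost is only one axis here; if the job has a deadline, runtime would need to be weighed as well.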
Some cases are special and need to be configured for the code you will be running. JDBC jobs, for example, need to be configured for the number of cores if you want ETL reads and writes to run on all nodes.
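One way to read the JDBC point, sketched in Python: Spark's JDBC source only reads in parallel when it is given `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions`, so a common starting point is to match `numPartitions` to the total worker cores. The helper name `jdbc_read_options`, the URL, the table, and the column below are hypothetical placeholders, and matching partitions to cores is an assumption about what the post means, not a rule.

```python
def jdbc_read_options(url, table, partition_column,
                      lower_bound, upper_bound,
                      workers, cores_per_worker):
    """Build Spark JDBC reader options with one partition per worker core.

    Hypothetical helper for illustration; the option keys are the standard
    Spark JDBC data source options.
    """
    num_partitions = workers * cores_per_worker
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": partition_column,  # numeric, date, or timestamp column
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
    }


# Placeholder connection details, assumed for the example only.
opts = jdbc_read_options(
    url="jdbc:postgresql://db.example.com:5432/sales",
    table="public.orders",
    partition_column="order_id",
    lower_bound=1,
    upper_bound=1_000_000,
    workers=4,
    cores_per_worker=4,
)
print(opts["numPartitions"])

# On a cluster with an active SparkSession this would be used as:
# df = spark.read.format("jdbc").options(**opts).load()
```

Each partition opens its own database connection, so the source database's connection limit may be the real ceiling rather than the core count.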
For BI platforms and Databricks SQL warehouses, these clusters need to be monitored at the query level. If a query runs for several hours but the execution time is only a few minutes, I'd create a smaller cluster for it, as most of the time is spent waiting on the BI platform to ingest the data.
For ML it all depends on the models and data. Start simple and adjust as needed. Some libraries and packages may need GPUs and some may not need more than a single instance.
For what it's worth, some operations will store a lot of data on the master (driver) node. I set a Spark config, spark.driver.maxResultSize, to allow results to use all but 1 GB of the driver memory.
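A small Python sketch of the "all but 1 GB" rule, assuming it refers to the driver's configured memory. The helper name `max_result_size` and the 16 GB driver figure are illustrative assumptions; `spark.driver.maxResultSize` itself is a standard Spark setting.

```python
def max_result_size(driver_memory_gb: int, reserve_gb: int = 1) -> str:
    """Return a spark.driver.maxResultSize value that leaves reserve_gb free.

    Hypothetical helper; driver_memory_gb is whatever the driver is
    configured with.
    """
    if driver_memory_gb <= reserve_gb:
        raise ValueError("driver memory must exceed the reserved amount")
    return f"{driver_memory_gb - reserve_gb}g"


# Assumed 16 GB driver, for the example only.
setting = f"spark.driver.maxResultSize {max_result_size(16)}"
print(setting)
```

The resulting line would typically go into the cluster's Spark config so that it is in place when the driver starts.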
12-08-2022 12:28 AM
Can you please help with the points below?
12-08-2022 12:59 AM
@Bharath Kumar Ramachandran Going with a job cluster will be cheaper, I believe.
12-08-2022 01:14 AM
In my project, we generally choose the cluster based on the data, the complexity of the code, and time.
12-08-2022 08:51 AM
@Ajay Pandey great