Possibility of determining the workload dynamically and spinning up a cluster based on it

Arunsundar
Databricks Partner

Hi Team,

Good morning.

I would like to understand whether it is possible to determine a workload automatically through code (for example, for a data load from a file to a table, checking the file size as a kind of benchmark) and, based on that, spin up an optimal cluster type with control over the minimum and maximum number of workers needed to complete the workload efficiently.
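
To make the idea concrete, here is a minimal sketch of what I have in mind. It is only an illustration: the function name `suggest_autoscale`, the 10 GiB-per-worker ratio, and the worker floor and cap are all hypothetical assumptions of mine, not Databricks recommendations. The only part taken from Databricks is the shape of the `autoscale` block (`min_workers` / `max_workers`) used in a cluster specification. Obtaining the file size is left to the caller.

```python
import math

GIB = 1024 ** 3

# Hypothetical sizing assumptions; real values would come from benchmarking.
BYTES_PER_WORKER = 10 * GIB   # assumed data volume one worker can handle
MIN_WORKERS_FLOOR = 1         # never go below one worker
MAX_WORKERS_CAP = 20          # assumed budget ceiling


def suggest_autoscale(file_size_bytes: int) -> dict:
    """Map an input file size to an autoscale range for a cluster spec."""
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")

    # Estimate the worker count from the file size, then clamp it
    # between the floor and the cap.
    estimated = math.ceil(file_size_bytes / BYTES_PER_WORKER)
    max_workers = min(max(estimated, MIN_WORKERS_FLOOR), MAX_WORKERS_CAP)

    # Start the cluster at roughly a quarter of the maximum and let
    # autoscaling add workers if the load needs them.
    min_workers = max(MIN_WORKERS_FLOOR, max_workers // 4)

    return {"autoscale": {"min_workers": min_workers, "max_workers": max_workers}}


if __name__ == "__main__":
    # Example: a 50 GiB input file.
    print(suggest_autoscale(50 * GIB))
```

The returned dictionary would be merged into the rest of the cluster definition (node type, Spark version, and so on) before the cluster or job is created. File size alone ignores file format, compression, and transformation complexity, so I assume the ratio would still need tuning against real runs.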

I would also like to understand whether the cluster can only be determined by trial and error, that is, by running the workload on various cluster types in the Dev environment and arriving at the optimal cluster, which we then attach in higher environments.

Kindly let me know if you have any further questions.

Thanks