Hi all,
we have a Databricks instance on Azure with a compute cluster on version 7.3 LTS.
Currently the cluster is set to a maximum of 4 workers (min workers: 1) of type Standard_D13_v2, plus 1 driver of the same type. Several jobs run on this cluster.
We are thinking of using Instance Pools to improve the initial spin-up time of the cluster. If we set the pool's max capacity to 2, my understanding is that when the cluster tries to scale up towards its maximum of 4 workers, the request will fail and the cluster will never grow beyond what the pool allows.
See: https://learn.microsoft.com/en-gb/azure/databricks/clusters/instance-pools/configure.
Maximum Capacity
The maximum number of instances that the pool will provision. If set, this value constrains all instances (idle + used). If a cluster using the pool requests more instances than this number during autoscaling, the request will fail with an INSTANCE_POOL_MAX_CAPACITY_FAILURE error.
A colleague tells me instead that the cluster will still scale up to 4 worker nodes, just with a slower start time for the extra workers (essentially acting as if the pool did not exist, but only for those workers). Can you please advise which of us is right?
Further to the above example: to accommodate the current setup of 4 max workers and 1 driver, I believe we would actually need to set the max capacity to 5 (4 workers + 1 driver). Can you please confirm my understanding, or am I missing something?
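
To make the arithmetic concrete, here is a minimal sketch of how I am counting instances. It assumes the driver is drawn from the same pool as the workers, and it models only my reading of the documentation quoted above (a request above max capacity fails). The function names are hypothetical and are not part of any Databricks API.

```python
def required_pool_capacity(max_workers: int, drivers: int = 1) -> int:
    """Instances the pool must be able to hold if the driver and all
    workers come from the same pool (assumption, not confirmed)."""
    return max_workers + drivers


def fits_in_pool(requested_instances: int, pool_max_capacity: int) -> bool:
    """My reading of the docs: a request succeeds only if the total
    (idle + used) stays within the pool's max capacity."""
    return requested_instances <= pool_max_capacity


needed = required_pool_capacity(max_workers=4)  # 4 workers + 1 driver
print(needed)
print(fits_in_pool(needed, pool_max_capacity=2))
print(fits_in_pool(needed, pool_max_capacity=5))
```

If my understanding is right, this prints 5, then False for a max capacity of 2, then True for a max capacity of 5.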