Databricks Community

EDDatabricks · ‎02-13-2023

Hi all,

We have a Databricks instance on Azure with a compute cluster running version 7.3 LTS.

Currently the cluster autoscales between 1 and 4 workers of type Standard_D13_v2, plus 1 driver of the same type. There are several jobs running on this cluster.

We are thinking of using Instance Pools to improve the initial spin-up time of the cluster. If we set the pool's maximum capacity to 2, my understanding is that when the cluster tries to scale up to its maximum of 4 workers, the request will fail and the cluster will never grow.

See: https://learn.microsoft.com/en-gb/azure/databricks/clusters/instance-pools/configure.

Maximum Capacity
 
The maximum number of instances that the pool will provision. If set, this value constrains all instances (idle + used). If a cluster using the pool requests more instances than this number during autoscaling, the request will fail with an INSTANCE_POOL_MAX_CAPACITY_FAILURE error.

A colleague tells me that the cluster will still scale up to 4 worker nodes, just with a slower start time (essentially acting as if the pool did not exist, but only for those extra workers). Can you please advise?

Further to the above example, to accommodate the current setup of 4 max workers and 1 driver, we would actually need to set the max capacity to 5 (4 workers + 1 driver). Can you please confirm my understanding, or am I missing something?

Lakshay · ‎02-15-2023

Hi @EDDatabricks, if you set the maximum capacity of the pool to 5, the pool will not be able to provide more than 5 instances. The maximum pool capacity is the total number of instances, including both the driver and the workers, so 5 instances means 4 workers and 1 driver.

That said, you don't necessarily need to configure a maximum capacity for a pool. Leaving it unset allows the pool to pull as many instances as needed. The purpose of maximum capacity is to make sure a pool does not create instances beyond a certain limit, ensuring the pool does not cost more than expected.
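
To make the capacity arithmetic concrete, here is a minimal Python sketch. It assumes the driver and workers draw from the same pool, as in the setup described above. The helper names and the pool name `jobs-pool` are hypothetical, and the dictionary keys follow the Instance Pools API field names as I understand them, so please check them against the current API reference before using them.

```python
from typing import Optional

DRIVER_NODES = 1  # one driver per cluster, assumed to come from the same pool


def required_pool_capacity(max_workers: int) -> int:
    """Instances the pool must allow for one cluster at full scale."""
    return max_workers + DRIVER_NODES


def fits_in_pool(workers: int, max_capacity: Optional[int]) -> bool:
    """True if a cluster with this many workers stays within the pool limit.

    A max_capacity of None models a pool with no maximum configured.
    """
    if max_capacity is None:
        return True
    return workers + DRIVER_NODES <= max_capacity


# Illustrative pool settings for the cluster in this thread.
pool_settings = {
    "instance_pool_name": "jobs-pool",  # hypothetical name
    "node_type_id": "Standard_D13_v2",  # node type from the question
    "max_capacity": required_pool_capacity(4),  # 4 workers + 1 driver
}

print(pool_settings["max_capacity"])
print(fits_in_pool(4, max_capacity=2))
print(fits_in_pool(4, max_capacity=5))
```

With a maximum capacity of 2, a request for 4 workers plus the driver does not fit, which is the case the documentation describes as failing with INSTANCE_POOL_MAX_CAPACITY_FAILURE. With 5 (or no maximum at all) it fits.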

Anonymous · ‎02-16-2023

Hi @EDDatabricks,

Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!