Job queue for pool limit

andyh · ‎11-09-2023

I have a cluster pool with a max capacity limit, to make sure we're not burning too much extra silicon. We use this pool for some of our less critical workflows/jobs. The instances still spend a lot of time idle, but the jobs sometimes hit this max capacity limit. Is there a way to get a job to wait for an available pool instance, rather than automatically failing with an

instance_pool_error_code:INSTANCE_POOL_MAX_CAPACITY_FAILURE

?

karthik_p · ‎11-10-2023

@andyh, did you get a chance to check the queue option in jobs? That may help. I'll update if we find any other options.
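For reference, here is a minimal sketch of what turning on queueing could look like when you build the job settings yourself. It assumes the job is managed through the Jobs API and that the settings accept a `queue` block with an `enabled` flag; the job name and the helper `with_queueing` are made up for illustration. Whether queueing also covers pool capacity failures is not confirmed in this thread, so treat it as something to test.

```python
import json


def with_queueing(job_settings: dict) -> dict:
    """Return a copy of the job settings with run queueing turned on.

    Assumption: the job settings accept a top-level "queue" block
    with an "enabled" flag.
    """
    updated = dict(job_settings)
    updated["queue"] = {"enabled": True}
    return updated


# Hypothetical job settings, for illustration only.
settings = with_queueing({"name": "low-priority-etl", "max_concurrent_runs": 1})
print(json.dumps(settings, indent=2))
```

The resulting dictionary would be sent as the job settings in a create or update request.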

karthik.p

SSundaram · ‎11-10-2023

Try increasing your max capacity limit; you might also want to bring down the minimum number of nodes the job uses.

At the job level, try configuring retries and the time interval between retries, so that a run that fails on the capacity limit is attempted again later.
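As a rough sketch of the retry suggestion, the snippet below adds retry settings to every task in a job definition. It assumes the job is defined through the Jobs API and that tasks accept `max_retries` and `min_retry_interval_millis`; the task keys, the default values, and the helper `with_retries` are hypothetical and should be tuned to how long the pool usually stays full.

```python
import copy


def with_retries(job_settings: dict,
                 max_retries: int = 3,
                 min_retry_interval_millis: int = 300_000) -> dict:
    """Return a copy of the job settings with retry options set on each task.

    Assumption: each task accepts "max_retries" and
    "min_retry_interval_millis" (here 5 minutes between attempts).
    """
    updated = copy.deepcopy(job_settings)
    for task in updated.get("tasks", []):
        task["max_retries"] = max_retries
        task["min_retry_interval_millis"] = min_retry_interval_millis
    return updated


# Hypothetical job with two tasks, for illustration only.
job = {"name": "low-priority-etl",
       "tasks": [{"task_key": "ingest"}, {"task_key": "report"}]}
retried = with_retries(job)
```

Retries do not reserve an instance; they only give the pool time to free one up, so a longer interval is usually the safer choice for a pool that stays busy.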
