12-30-2021 06:28 AM
Hi,
Is it possible to configure job throttling so that jobs are queued across a workspace once a given number of concurrent executions is reached when using the ephemeral cluster pattern? The reason is mainly cost control. We would rather reduce performance than increase cost if too many jobs are executed at once, for whatever reason.
Thanks
12-30-2021 09:28 AM
Hello @E H - Welcome and thank you for asking. My name is Piper, and I'm a moderator for Databricks. Let's see what the other members have to say. If we don't hear anything, we'll swing back to this.
12-30-2021 04:18 PM
If you have a fixed size cluster, this will happen automatically. Just don't turn on autoscaling.
https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling
12-30-2021 04:34 PM
Thanks for the answer josephk.
However, this solution doesn't work in my case.
If I launch 20 different jobs, I will have 20 ephemeral clusters running at the same time. Hence, if they each run for 5 minutes, we will be billed for 100 minutes of execution.
The idea would be, for example, to have a configurable maximum execution time for each 30-minute window. Jobs beyond that limit would be queued. In my example, I could just take my configured execution time * 48 (the number of 30-minute windows in a day) and it would give me the worst possible case for a single day.
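To make the arithmetic concrete, here is a minimal sketch of both calculations. The function names and the 100-minute budget are made up for illustration; nothing here is a Databricks setting.

```python
MINUTES_PER_DAY = 24 * 60


def unthrottled_cluster_minutes(job_count, minutes_per_job):
    """Billed cluster-minutes when every job gets its own ephemeral cluster."""
    return job_count * minutes_per_job


def worst_case_daily_minutes(budget_minutes_per_window, window_minutes=30):
    """Upper bound on billed minutes per day if each window is capped at the budget."""
    # 1440 // 30 == 48 windows per day, matching the "* 48" above.
    windows_per_day = MINUTES_PER_DAY // window_minutes
    return budget_minutes_per_window * windows_per_day


# 20 jobs of 5 minutes each, all on separate clusters.
print(unthrottled_cluster_minutes(20, 5))
# Hypothetical cap of 100 cluster-minutes per 30-minute window.
print(worst_case_daily_minutes(100))
```

With a cap like this, the daily bound depends only on the configured budget, not on how many jobs happen to be launched.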
I also tried a scenario using a pool with a maximum set of VMs. However, instead of queuing jobs, additional jobs failed since new VMs couldn't be provisioned.
The objective is to be able to predict cost by calculating the worst case for a single day and to ensure that we never go beyond it. Right now, the only way I can do this is by using an interactive cluster (and paying more DBUs) instead of a job cluster.
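For reference, the queuing behaviour I am after could be approximated on the client side. This is only a rough sketch, assuming the jobs are launched from a single driver script: `run_job` is a hypothetical callable that starts one job and blocks until it finishes, not a Databricks API, and the cap applies only to jobs submitted through this script rather than to the whole workspace.

```python
from concurrent.futures import ThreadPoolExecutor


def run_throttled(jobs, run_job, max_concurrent):
    """Run run_job(job) for every job, with at most max_concurrent in flight.

    Jobs beyond the cap wait in the executor's internal queue instead of
    failing. Results come back in the same order as the input jobs.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map preserves input order and re-raises any job's exception.
        return list(pool.map(run_job, jobs))
```

With `max_concurrent=3` and 20 jobs, at most 3 ephemeral clusters would exist at any moment, so the worst-case spend is bounded by the cap rather than by the number of jobs submitted.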
12-30-2021 04:38 PM
Why not just run all the jobs on the same cluster? That will save you a lot of time not starting up 20 clusters.
12-30-2021 04:42 PM
That is only possible with an interactive cluster (which costs more DBUs), at least as far as I know.
12-30-2021 04:49 PM
Yes, that's correct. There is a new feature on the roadmap to reuse the same cluster, which should help speed things up.
It might still be worth running everything on one interactive cluster, which again shouldn't be too expensive if it is a small single-node cluster.
12-30-2021 05:03 PM
Thanks for the help josephk. I will continue to use an interactive cluster for the time being until the release of that new feature. Hopefully, it will allow my use case. Is there visibility on the roadmap for an ETA or more information on it?
01-01-2022 04:00 PM
The image is from the roadmap that was released in November, so it should be in preview sometime this month if it isn't already. Talk to your CSE about the preview testing.
01-03-2022 06:50 AM
@E H that feature is in preview! Hit me up at bilal dot aslam at databricks dot com and I will get you enrolled in it.