Spot instances - Best practice

Anonymous · 06-14-2021

We are having difficulties running our jobs on spot instances that get reclaimed by AWS during shuffles. Is there any documentation or set of best practices for this? We went through this article, but is there anything else to keep in mind?

sean_owen · 06-17-2021

What are you setting your bid price to? I think it's reasonable to set it to 100% of the on-demand price, or else you may get evicted more frequently. For a job like this, it's also a good idea to make only _some_ of the executors spot instances, so that you never lose a critical mass of executors while still saving some money.
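As a rough illustration of that mix, here is a minimal sketch that assumes the cluster is defined through the Databricks Clusters API, which the thread does not state explicitly. The `aws_attributes` fields `first_on_demand`, `availability`, and `spot_bid_price_percent` come from that API; the helper name, cluster name, and node counts are hypothetical.

```python
import json


def spot_aws_attributes(on_demand_nodes: int = 2, bid_percent: int = 100) -> dict:
    """Build an aws_attributes fragment that mixes on-demand and spot nodes.

    Assumption: the target is the Databricks Clusters API. The first
    `on_demand_nodes` nodes (the driver counts as one of them) are placed on
    on-demand instances; the remaining nodes are requested as spot instances.
    """
    if on_demand_nodes < 1:
        raise ValueError("keep at least one on-demand node so the driver is not on spot")
    return {
        "first_on_demand": on_demand_nodes,
        # Fall back to on-demand capacity when spot capacity cannot be acquired.
        "availability": "SPOT_WITH_FALLBACK",
        # Bid expressed as a percentage of the on-demand price.
        "spot_bid_price_percent": bid_percent,
    }


# Hypothetical cluster: 8 workers, with the driver and 2 workers on on-demand nodes.
cluster_spec = {
    "cluster_name": "shuffle-heavy-job",  # hypothetical name
    "num_workers": 8,
    "aws_attributes": spot_aws_attributes(on_demand_nodes=3),
}

print(json.dumps(cluster_spec, indent=2))
```

How many nodes to keep on-demand is a judgment call: more on-demand nodes means fewer executors can be lost mid-shuffle, at a higher cost.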


User16783853906 · 06-25-2021

Due to recent changes in the AWS spot marketplace, legacy techniques such as a higher spot bid price (>100%) are ineffective at retaining an acquired spot node, and instances can be lost with 2 minutes' notice, causing workloads to fail.

To mitigate this, we should encourage customers to rely on the following:

Using multiple instance families as part of their cluster/pool creation
Provisioning the master node from an on-demand pool
Using an appropriate spot allocation strategy, such as CAPACITY_OPTIMIZED or LOW_PRICE
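
The second recommendation could be sketched as follows, again assuming the Databricks Clusters API, where `instance_pool_id` selects the worker pool and `driver_instance_pool_id` selects a separate pool for the driver. The helper name and pool IDs are hypothetical placeholders, and the sketch assumes the driver pool was created with on-demand availability. Instance-family diversification and the allocation strategy are configured at pool or fleet creation and are not shown here.

```python
def pooled_cluster_spec(worker_pool_id: str, driver_pool_id: str, num_workers: int) -> dict:
    """Build a cluster spec whose driver comes from a different pool than the workers.

    Assumption: `driver_pool_id` refers to a pool created with on-demand
    availability, and `worker_pool_id` refers to a spot-backed pool.
    """
    if worker_pool_id == driver_pool_id:
        raise ValueError("the driver should come from a separate on-demand pool")
    return {
        "num_workers": num_workers,
        "instance_pool_id": worker_pool_id,         # workers: spot-backed pool
        "driver_instance_pool_id": driver_pool_id,  # driver: on-demand pool
    }


# Hypothetical pool IDs, for illustration only.
spec = pooled_cluster_spec("pool-spot-workers", "pool-on-demand-driver", num_workers=8)
print(spec)
```

Keeping the driver on on-demand capacity means a spot reclamation costs executors, which the job may be able to recover from, rather than the driver, whose loss fails the whole job.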