06-14-2021 02:26 PM
We are having difficulty running our jobs on spot instances that AWS reclaims during shuffles. Is there any documentation or set of best practices for this? We went through this article, but is there anything else to keep in mind?
Labels: Spot, Spot instances
06-17-2021 04:15 PM
What are you setting your bid price to? I think it's reasonable to set it to 100% of the on-demand price; otherwise you may get evicted more frequently. For a job like this, it's also a good idea to make only _some_ of the executors spot instances, so that you never lose a critical mass of executors while still saving some money.
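As a rough sketch of that mixed setup, the snippet below builds a cluster spec in the shape of the Databricks Clusters API `aws_attributes` block. The function name, cluster name, runtime version, and node type are placeholders I made up for illustration; `first_on_demand`, `availability`, and `spot_bid_price_percent` are the fields I'd expect to use, but check them against the API docs for your workspace. `first_on_demand` counts the driver, so keeping 3 workers on-demand means setting it to 4.

```python
import json


def spot_cluster_spec(num_workers, on_demand_workers, bid_percent=100):
    """Build a cluster spec that keeps the driver and some workers on-demand.

    Hypothetical helper for illustration; not part of any Databricks SDK.
    """
    if not 0 <= on_demand_workers <= num_workers:
        raise ValueError("on_demand_workers must be between 0 and num_workers")
    return {
        "cluster_name": "shuffle-heavy-job",   # placeholder name
        "spark_version": "8.3.x-scala2.12",    # placeholder runtime version
        "node_type_id": "i3.xlarge",           # placeholder instance type
        "num_workers": num_workers,
        "aws_attributes": {
            # The first N nodes are on-demand; the driver is node 1.
            "first_on_demand": 1 + on_demand_workers,
            # Remaining nodes are spot, falling back to on-demand if unavailable.
            "availability": "SPOT_WITH_FALLBACK",
            # Bid as a percentage of the on-demand price.
            "spot_bid_price_percent": bid_percent,
        },
    }


spec = spot_cluster_spec(num_workers=10, on_demand_workers=3)
print(json.dumps(spec, indent=2))
```

The resulting dict can be sent as the JSON body of a cluster-create request; how many workers to keep on-demand is a judgment call that depends on how expensive it is to recompute lost shuffle data.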
06-25-2021 03:08 PM
Due to recent changes in the AWS Spot marketplace, legacy techniques such as a higher spot bid price (>100%) are ineffective at retaining an acquired spot node. Instances can be reclaimed with only 2 minutes' notice, causing workloads to fail.
To mitigate this, we should encourage customers to:
- Use multiple instance families when creating their cluster or pool
- Provision the master (driver) node from an on-demand pool
- Choose an appropriate spot allocation strategy, such as CAPACITY_OPTIMIZED or LOW_PRICE
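For the on-demand driver recommendation, here is a minimal sketch of a cluster spec that takes the driver from one pool and the workers from another. It assumes two pools already exist (one on-demand, one spot) and that the Clusters API fields `driver_instance_pool_id` and `instance_pool_id` are available in your workspace. The pool IDs, cluster name, runtime version, and helper name are made up for illustration. The instance-family and allocation-strategy items are pool-level settings and are not shown here.

```python
def pooled_cluster_spec(driver_pool_id, worker_pool_id, num_workers):
    """Build a cluster spec with the driver and workers drawn from separate pools.

    Hypothetical helper for illustration; not part of any Databricks SDK.
    """
    if driver_pool_id == worker_pool_id:
        raise ValueError("driver and workers should come from different pools")
    return {
        "cluster_name": "spot-workers-on-demand-driver",  # placeholder name
        "spark_version": "8.3.x-scala2.12",               # placeholder runtime version
        "num_workers": num_workers,
        # Driver comes from the on-demand pool, so a spot reclaim cannot take it.
        "driver_instance_pool_id": driver_pool_id,
        # Workers come from the spot pool.
        "instance_pool_id": worker_pool_id,
    }


# Both pool IDs below are invented placeholders.
pooled = pooled_cluster_spec(
    driver_pool_id="pool-on-demand-0000",
    worker_pool_id="pool-spot-0000",
    num_workers=8,
)
print(pooled)
```

Losing a spot worker costs some recomputation, but losing the driver fails the whole job, which is why the driver is the first node worth paying on-demand prices for.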