Run driver on spot instance
11-18-2023 07:28 AM
The traditional advice seems to be to run the driver on on-demand instances and, optionally, the workers on spot. This is indeed what happens when one chooses to run with spot instances in Databricks. But I am interested in what happens if the driver itself runs on a spot instance and gets evicted. Can we end up with corrupt data?
We have some batch jobs that run as Structured Streaming jobs every night. They seem like prime candidates to run on 100% spot with retries, but I first want to understand why this is not a more common pattern.
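
To make the question concrete, here is a rough sketch of the kind of setup I have in mind. It assumes AWS and the Databricks Jobs API field names as I understand them (`first_on_demand`, `availability`, `max_retries`, `min_retry_interval_millis`). The job name, notebook path, node type, and runtime version are placeholders, not our real configuration.

```python
import json


def spot_job_spec(notebook_path: str, max_retries: int = 3) -> dict:
    """Build a hypothetical job spec that puts the driver on spot as well.

    Assumption: with first_on_demand set to 0, no node (including the
    driver) is pinned to an on-demand instance.
    """
    return {
        "name": "nightly-stream-batch",  # placeholder name
        "tasks": [
            {
                "task_key": "nightly",
                "notebook_task": {"notebook_path": notebook_path},
                # Retry the whole task if the run fails, e.g. after a
                # driver eviction.
                "max_retries": max_retries,
                "min_retry_interval_millis": 60_000,
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # placeholder
                    "node_type_id": "i3.xlarge",  # placeholder
                    "num_workers": 2,
                    "aws_attributes": {
                        "first_on_demand": 0,  # nothing forced on-demand
                        "availability": "SPOT",  # no on-demand fallback
                    },
                },
            }
        ],
    }


spec = spot_job_spec("/Jobs/nightly")  # placeholder path
print(json.dumps(spec, indent=2))
```

The idea is that a retried run would pick up from the streaming checkpoint, so the open question is whether an eviction in the middle of a micro-batch can leave the sink in a bad state.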