Great article. Cutting EC2 costs is important, especially for companies with heavy data engineering / data science workflows. Spark service providers make it easy to spin up huge clusters (100+ nodes) for ad hoc analyses, and the costs can quickly spiral out of control, even if you're getting 3x cost savings on the spot market.
Some tangential thoughts:
* Is there an AWS API that returns the cheapest availability zone in a region for a given instance type? Or is the GUI shown in the blog post's screenshot the only way to see it?
* I have seen 90%+ cost savings for certain instance types
* Sometimes you lose a spot instance, check the pricing history graph to confirm the price spike, and don't see any spike above your bid price... it can be frustrating
I'm not aware of any API (surprising for AWS), but you can use Spot Fleet to get a set of spot instances optimized for cost.
Having a bid above the spot price does not guarantee you'll keep the spot instance. AWS can terminate a spot instance at any time if they need the capacity; that's the deal. It used to be more closely tied to your bid price, but they've been moving away from that.
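
On the cheapest-AZ question: as far as I know there is no single call that returns "the cheapest zone", but the EC2 `DescribeSpotPriceHistory` call does return spot prices per availability zone, so the comparison could be done client-side. A rough sketch with boto3, assuming credentials are already configured; the helper names (`latest_price_per_az`, `cheapest_az`, `fetch_spot_history`) and the default product description are my own choices, not anything from the article:

```python
from datetime import datetime, timezone
from decimal import Decimal


def latest_price_per_az(records):
    """Keep only the most recent price entry for each availability zone."""
    latest = {}
    for rec in records:
        az = rec["AvailabilityZone"]
        if az not in latest or rec["Timestamp"] > latest[az]["Timestamp"]:
            latest[az] = rec
    # SpotPrice comes back as a string, so convert it for comparison.
    return {az: Decimal(rec["SpotPrice"]) for az, rec in latest.items()}


def cheapest_az(records):
    """Return (zone, price) for the lowest current price, or None if empty."""
    prices = latest_price_per_az(records)
    if not prices:
        return None
    az = min(prices, key=prices.get)
    return az, prices[az]


def fetch_spot_history(instance_type, region, product="Linux/UNIX"):
    """Fetch the price in effect right now for every zone in the region."""
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_spot_price_history")
    records = []
    for page in paginator.paginate(
        InstanceTypes=[instance_type],
        ProductDescriptions=[product],
        StartTime=datetime.now(timezone.utc),
    ):
        records.extend(page["SpotPriceHistory"])
    return records
```

Usage would be something like `cheapest_az(fetch_spot_history("m5.xlarge", "us-east-1"))`, where the instance type and region are just placeholders. Keep in mind this only compares price; it says nothing about how likely each zone is to reclaim the instance.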