Running your EMR workloads on Elastigroup lets you significantly reduce costs by reliably running Big Data workloads on Spot Instances. Beyond cost savings, the Elastigroup/EMR integration also increases the availability of your clusters: your workloads always run on the best possible mix of Reserved, On-Demand and Spot Instances. In case of failure, Elastigroup will auto-heal your infrastructure and maintain maximum availability at minimum cost.
Today, we’re happy to announce two new enhancements to the Elastigroup/EMR integration:
You can now increase or decrease the EMR group size on a predetermined schedule. This means you can process more data at specific times, and Elastigroup will scale to accommodate the schedule.
The scheduling option is now available on the Strategy & Compute screen when creating an EMR Elastigroup. For more information about scheduling, read this article.
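To make the idea of a predetermined schedule concrete, here is a minimal Python sketch of how a time window can map to a target group size. It is an illustration only, not Elastigroup's implementation or API: the names ScheduledSize and target_size_at, the 01:00 to 05:00 window and the sizes 20 and 4 are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time
from typing import Sequence


@dataclass(frozen=True)
class ScheduledSize:
    """Hypothetical schedule entry: run `target` instances during a daily window."""
    start: time   # window start, inclusive
    end: time     # window end, exclusive
    target: int   # desired group size inside the window


def target_size_at(now: time, schedule: Sequence[ScheduledSize], default: int) -> int:
    """Return the target of the first window containing `now`, else `default`."""
    for window in schedule:
        if window.start <= now < window.end:
            return window.target
    return default


# Assumed example: scale up to 20 instances for a nightly batch window,
# and fall back to 4 instances for the rest of the day.
schedule = [ScheduledSize(start=time(1, 0), end=time(5, 0), target=20)]
nightly = target_size_at(time(2, 30), schedule, default=4)
daytime = target_size_at(time(12, 0), schedule, default=4)
```

In this sketch, a lookup at 02:30 falls inside the window and returns the larger size, while a lookup at noon returns the default.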
Calculating the number of instances required for a job can be a tricky, time-consuming task. With Elastigroup, you can now specify the number of CPUs you’d like EMR to use, and Elastigroup will ensure that the appropriate instances are available at any given time. This means that you don’t have to worry about specific instance types. You can let Elastigroup scale infrastructure behind the scenes and rest assured your EMR cluster will always have the number of CPUs you’ve set for the job.
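The arithmetic behind a CPU target can be sketched in a few lines of Python. This is only an illustration of counting capacity in CPUs rather than instances, not Elastigroup's actual selection logic: the function names are hypothetical, the instance types are an arbitrary choice, and the weights are the published vCPU counts for those types.

```python
import math
from typing import Dict

# Weight of each instance type, expressed as its vCPU count.
VCPUS: Dict[str, int] = {"m4.large": 2, "m4.xlarge": 4, "m4.2xlarge": 8}


def instances_needed(target_cpus: int, instance_type: str) -> int:
    """Smallest number of instances of one type that reaches the CPU target."""
    return math.ceil(target_cpus / VCPUS[instance_type])


def total_cpus(fleet: Dict[str, int]) -> int:
    """Total CPUs supplied by a mixed fleet of {instance_type: count}."""
    return sum(VCPUS[itype] * count for itype, count in fleet.items())


# A 64-CPU target can be met by a single type or by a mix of types.
single_type = instances_needed(64, "m4.xlarge")
mixed_fleet = total_cpus({"m4.large": 4, "m4.2xlarge": 7})
```

Here, sixteen m4.xlarge instances and a mix of four m4.large plus seven m4.2xlarge both supply 64 CPUs, which is why a CPU target frees you from choosing instance types up front.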
To use CPU as the target unit for your EMR Elastigroup, use “Weight” as the unit under “Capacity” as shown here: