I have gotten into heated discussions over the subject on Twitter. I go into sales meetings and get clients excited about dynamic scaling, only to have to vigorously talk them out of the idea of auto-scaling. I just don't like auto-scaling.
What is auto-scaling?
Auto-scaling is the ability (with certain cloud infrastructure management tools like enStratus, in a limited beta through the end of the year) to add capacity to and remove capacity from a cloud infrastructure based on actual usage. No human intervention is necessary.
It sounds amazing—no more overloaded web sites. Just stick your site into the cloud and come what may! You just pay for what you use.
But I don't like auto-scaling.
What is dynamic scaling?
Auto-scaling takes advantage of a critical feature of the cloud called dynamic scaling. Dynamic scaling is the ability to add capacity to and remove capacity from your cloud infrastructure on a whim, ideally because you know your traffic patterns are about to change and you are adjusting accordingly.
I like dynamic scaling.
If you care about the scalability of your applications, whether in the cloud, in a managed hosting infrastructure, or in an internal infrastructure, you should thoroughly understand capacity planning. If you don't, you should pick up John Allspaw's book, The Art of Capacity Planning.
In short, capacity planning is how you understand your traffic patterns, how they change periodically, how you expect them to grow, and what kind of infrastructure is necessary to support those traffic patterns. You cannot design any kind of infrastructure without doing proper capacity planning. Otherwise, you will overspend on infrastructure or you will get slashdotted* and lose money.
Capacity planning is critical largely because it enables you to tie infrastructure costs to the benefit the organization will see from different combinations of capacity and demand.
Consider an example in which you know you have an average demand requiring a single server but, for an hour out of the year, you need ten servers.
Even with the cloud, it is possible that you can never justify the spend on meeting the one hour of demand. On the other hand, it is possible that the cloud is just the tool you need to make meeting that demand cost-effective. Or it could be that that one hour is so critical to your business that it makes the rest of the year irrelevant.
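To make the trade-off concrete, here is a rough sketch of the arithmetic for the one-server, ten-servers-for-an-hour example. It counts server-hours only; the function names are made up for illustration, and you would multiply by whatever hourly rate your provider actually charges.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def server_hours_fixed(peak_servers):
    """Server-hours if you run peak capacity all year long."""
    return peak_servers * HOURS_PER_YEAR


def server_hours_dynamic(baseline_servers, peak_servers, peak_hours):
    """Server-hours if you run the baseline all year and add the
    extra servers only for the hours of peak demand."""
    extra = (peak_servers - baseline_servers) * peak_hours
    return baseline_servers * HOURS_PER_YEAR + extra


fixed = server_hours_fixed(10)             # ten servers, all year
dynamic = server_hours_dynamic(1, 10, 1)   # one server, plus nine for one hour
print(fixed, dynamic)
```

Under these assumptions the peak hour adds only nine server-hours to the year. Whether even that spend is justified is still a business question that the arithmetic alone cannot answer.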
Without proper capacity planning, you won't know what those traffic patterns are or what kind of costs make sense to take on in support of them.
Can't you avoid capacity planning through auto-scaling?
No, you can't. Get the idea out of your head right now. For the most part, auto-scaling is nothing more than a crutch for those too lazy to do real capacity planning. True, if you configure your site for auto-scaling with no governors limiting the max capacity, you will never get slashdotted. And once the sudden, unexpected volume subsides, your infrastructure will return to its baseline configuration.
Here's why that's stupid:
1. Amazon and other clouds cannot respond fast enough to increased capacity needs.
It can take up to 10 minutes for your EC2 instances to launch. That's 10 minutes between when your cloud infrastructure management tool detects the need for extra capacity and the time when that capacity is actually available. That's 10 minutes of impaired performance for your customers (or perhaps even 10 minutes of downtime).
By the way, Amazon S3 has not proven itself to be the most stable of Amazon's cloud offerings. You could thus also find yourself totally unable to add capacity at a critical time.
Guess what? Almost all capacity changes are foreseeable. If you had done proper capacity planning, you would have had two key advantages:
- You would have added the capacity before it was needed, guaranteeing that the proper capacity is always in place.
- You would have discovered any operational issues with Amazon S3 before they impacted your operations (allowing you to take alternative steps to deal with the situation).
2. Got any disgruntled employees, unhappy customers, or malicious competitors?
Here's an easy way to go broke: Set up auto-scaling with no governors limiting the maximum capacity. Any yahoo can then execute a distributed denial-of-service (DDoS) attack against your infrastructure. It won't take down your environment because your cloud provider can almost certainly withstand a reasonable attack. It will, however, cause you to add more and more servers into your infrastructure until you go broke.
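A toy sketch of how the bill grows under attack, assuming a naive auto-scaler that launches exactly as many servers as the traffic appears to demand each hour. The demand figures and the `governor_max` parameter are invented for illustration and do not describe any real tool.

```python
def attack_server_hours(demand_by_hour, governor_max=None):
    """Total server-hours billed when capacity simply follows demand.

    demand_by_hour: servers the traffic appears to require, per hour.
    governor_max:   optional hard cap on servers; None means no governor.
    """
    total = 0
    for demand in demand_by_hour:
        servers = demand if governor_max is None else min(demand, governor_max)
        total += servers
    return total


# Hypothetical attack ramping up over six hours.
attack = [2, 10, 50, 200, 200, 200]
print(attack_server_hours(attack))                  # no governor
print(attack_server_hours(attack, governor_max=8))  # capped at 8 servers
```

The uncapped run pays for every server the attacker manages to provoke. The capped run limits the damage, but the cap itself is only a guess unless it comes out of capacity planning.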
3. So you think you'll stick some governors in place...
You definitely should never have auto-scaling without governors in place, but they really won't do you any good. They will simply respond to one of two events:
- Capacity demands you should have planned for, and thus don't need auto-scaling for.
- Capacity demands you could not have planned for, and thus you have no idea whether the governor level you have set is even appropriate to the traffic.
Sometimes traffic is truly unexpected. But not as often as you think. If you know you are getting coverage in some publication, marketing should have done an ROI projection on the campaign and be able to provide you with expected response rates.
There is, however, the rare occasion when you knew you were going to get coverage in one place, but another much larger venue (like Slashdot) suddenly picked up the story and ran with it. On this rare occasion, you really, really, really would like your site to scale to match the needs of this unexpected traffic.
But you don't want it to auto-scale. Auto-scaling cannot differentiate between valid traffic and nonsense. You can. If your environment is experiencing a sudden, unexpected spike in activity, the appropriate approach is to have minimal auto-scaling with governors in place, receive a notification from your cloud infrastructure management tools, then determine the best way to respond going forward.
Here, the auto-scaling is simply a band-aid to enable a human to use dynamic scaling to define an appropriate, temporary capacity to support the unexpected change in demand.
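As a minimal sketch of that band-aid, assume a hypothetical decision function that your management tool calls with the currently observed demand. The names `ScalingDecision`, `decide`, `planned_servers`, and `governor_max` are illustrative, not part of any real product's API.

```python
from dataclasses import dataclass


@dataclass
class ScalingDecision:
    target_servers: int    # capacity to run right now
    notify_operator: bool  # whether a human should be alerted


def decide(demand_servers, planned_servers, governor_max):
    """Hold the planned capacity unless demand exceeds it.

    When demand exceeds the plan, scale only as far as the governor
    allows and flag the event so a human can decide what happens next.
    """
    if demand_servers <= planned_servers:
        return ScalingDecision(planned_servers, False)
    return ScalingDecision(min(demand_servers, governor_max), True)


print(decide(3, planned_servers=4, governor_max=6))
print(decide(20, planned_servers=4, governor_max=6))
```

The second call stops at the governor and raises the notification flag. From there a person judges whether the traffic is valid and uses dynamic scaling to set an appropriate temporary capacity.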
4. Don't you lose a key value of the cloud without auto-scaling?
No. If you properly use dynamic scaling, you pay for exactly the capacity you need and nothing more. You still add capacity when you need it; you just add it according to a plan rather than willy-nilly based on perceived external events.
Dynamic scaling according to a plan can also be automated. If you know you have a batch window from midnight to 3am, set your cloud infrastructure management tools to add capacity proactively at 11:30 and throttle back at 3:30. You just don't want the system automatically adjusting capacity based on usage.
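The batch-window schedule might look something like the following sketch. The server counts (`baseline`, `batch`) are assumed values, and in practice the result would feed whatever scheduling feature your management tool provides.

```python
from datetime import time

SCALE_UP = time(23, 30)   # add capacity ahead of the midnight batch window
SCALE_DOWN = time(3, 30)  # throttle back after the window closes at 3am


def planned_servers(now, baseline=1, batch=4):
    """Return the planned server count for a given time of day.

    The window crosses midnight, so a time is inside it when it is
    at or after SCALE_UP or strictly before SCALE_DOWN.
    """
    in_window = now >= SCALE_UP or now < SCALE_DOWN
    return batch if in_window else baseline


print(planned_servers(time(23, 45)))  # inside the window
print(planned_servers(time(12, 0)))   # outside the window
```

Capacity here depends only on the clock and the plan, never on observed usage, which is the distinction between scheduled dynamic scaling and auto-scaling.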
* Getting slashdotted generally refers to a web site with modest activity suddenly going down under an influx of valid traffic, often due to coverage in a popular online publication like Slashdot.