Two weeks ago, Amazon added a trio of long-awaited features to their Amazon Web Services offering: load balancing, "monitoring", and auto-scaling. All three pieces significantly enhance what you can do with an infrastructure deployed in AWS, but they have their limitations.
The core service in the bunch is Amazon CloudWatch. Amazon calls it a monitoring system, but I don't. I call it an analytics system because I think a monitoring system should alert you when it notices critical changes in your infrastructure. Definitions aside, CloudWatch is an immensely useful service. For $0.015/CPU-hour, you can keep track of your instance health on a variety of dimensions like:
- CPU Usage
- Network I/O
- Disk I/O
Direct access to this information is useful only if you are using a tool like enStratus to pull it out for you. However, it's indirectly required for auto-scaling.
Elastic Load Balancing
AWS has gradually been virtualizing every piece of physical hardware that typically lives in an internal data center. The latest piece of hardware to be virtualized is the load balancer. Through Elastic Load Balancing, you can dynamically provision virtual load balancers that will spread network load equally across multiple AWS availability zones.
To understand the strengths and weaknesses of AWS Elastic Load Balancing, you have to understand the alternatives. Prior to this service, people would traditionally launch a cheap Amazon EC2 instance with Apache + mod_jk, Apache + HAProxy, Zeus, or a number of other software-based solutions. These options work well except for two key issues:
- EC2 instances fail, and a high-availability architecture based on EC2 instances with software load balancing is hard to build
- $0.10/CPU-hour is a bit pricey for running Apache + mod_jk
AWS Elastic Load Balancing comes in at $0.025/CPU-hour plus $0.08/GB data transfer. With AWS Elastic Load Balancing, I expect (but have insufficient operational data to back up the expectation) that we will see significantly higher uptimes than with standard EC2 instances. My expectation is based on the architecture of the elastic load balancers, which includes fault-tolerance (something lacking in EC2 instances) and is not tied to a specific IP address. When you configure a load balancer, you map your external DNS using a CNAME to the DNS name provided by AWS.
I would also expect (again, lacking actual data) AWS elastic load balancers to perform much better under high stress than a cheap EC2 instance.
And Amazon elastic load balancers are MUCH cheaper than the cheap EC2 instances.
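To make the price comparison concrete, here is a rough sketch in Python using only the rates quoted above. The 730-hour month and the function names are my own assumptions for illustration, and the sketch ignores whatever bandwidth charges the EC2 instance would also incur, so treat it as a back-of-the-envelope estimate rather than a bill.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

EC2_HOURLY = 0.10   # $/hour for the cheap EC2 instance, as quoted above
ELB_HOURLY = 0.025  # $/hour for an elastic load balancer, as quoted above
ELB_PER_GB = 0.08   # $/GB through the load balancer, as quoted above


def ec2_balancer_cost(hours: float) -> float:
    """Hourly charge only; EC2 bandwidth charges are left out of this sketch."""
    return EC2_HOURLY * hours


def elb_cost(hours: float, gigabytes: float) -> float:
    """Hourly charge plus the per-GB charge for traffic through the balancer."""
    return ELB_HOURLY * hours + ELB_PER_GB * gigabytes


def break_even_gb(hours: float) -> float:
    """Traffic volume at which the ELB charge equals the EC2 hourly charge."""
    return (EC2_HOURLY - ELB_HOURLY) * hours / ELB_PER_GB


if __name__ == "__main__":
    print(f"EC2 balancer:  ${ec2_balancer_cost(HOURS_PER_MONTH):.2f}/month")
    print(f"ELB at 100 GB: ${elb_cost(HOURS_PER_MONTH, 100):.2f}/month")
    print(f"Break-even:    {break_even_gb(HOURS_PER_MONTH):.1f} GB/month")
```

Under these assumptions the elastic load balancer stays cheaper until monthly traffic gets into the hundreds of gigabytes, and counting the EC2 instance's own bandwidth charges would push that break-even point higher still.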
The elastic load balancers nevertheless have some important limitations.
- They lack the level of complex load balancing control provided by hardware load balancers or specialized load balancing software like Zeus
- Their ability to support SSL is very limited
- You cannot do name-based load balancing
Most people running in AWS are already living with the first limitation because EC2 instances do no better unless you are using specialized load balancing software. The last two (in particular the SSL concerns) are generally more significant.
Let's assume you have a traditional web application with the load balancer, application server, and database server. Before Elastic Load Balancing, you may have had Apache on an EC2 instance with your SSL certificates installed or name-based rules to route traffic meant for different applications to different ports.
With Elastic Load Balancing, you will have to move all SSL into the application server or into a proxy running on the application server. This approach may get really ugly with your SSL provider, as SSL certificates are often sold on a "per-server" basis. Thus, you will have to purchase the rights to install your SSL certificate on the maximum number of hosts to which you intend your infrastructure to scale. On the other hand, with EC2 host-based load balancing, you needed only a single SSL host certificate to support an infinite number of application servers.
The final issue, no name-based routing, impacts you if you are used to supporting multiple web sites or applications on the same IP address while running them on different backend ports on the application server. You might, for example, have www.imaginary.com listening on port 8080 of your application server and things.imaginary.com on port 8081, with the Apache-based load balancer routing port 80 traffic to one of the two ports based on the Host value in the HTTP headers.
To accomplish the same result with Elastic Load Balancing, you must route all port 80 traffic to an Apache reverse proxy running on each application server behind the load balancer. The reverse proxy would then perform the name-based routing.
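The routing decision that reverse proxy makes is simple enough to sketch in a few lines of Python. This is only an illustration of the logic, not an Apache configuration: the table reuses the example hostnames and ports from above, and the function name and the default port are assumptions of mine.

```python
# Hypothetical host-to-port table, mirroring the example hostnames above.
BACKEND_PORTS = {
    "www.imaginary.com": 8080,
    "things.imaginary.com": 8081,
}


def backend_port(host_header: str, default: int = 8080) -> int:
    """Pick the backend port for a request from its Host header value."""
    # Drop an optional ":port" suffix and normalize case before the lookup.
    hostname = host_header.split(":", 1)[0].strip().lower()
    # Unknown hosts fall through to the default port (an assumption).
    return BACKEND_PORTS.get(hostname, default)


if __name__ == "__main__":
    for host in ("www.imaginary.com", "Things.Imaginary.com:80"):
        print(host, "->", backend_port(host))
```

Every application server behind the load balancer has to carry this same table, which is part of the added complexity.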
In short, Amazon Elastic Load Balancing gives you greater reliability and performance in exchange for higher application complexity and potentially greater cost (thanks to the increased cost of SSL certificates), or potentially lower cost if you are not using SSL.
Amazon also introduced auto-scaling—for free!
With Amazon auto-scaling, you define auto-scaling groups based on pre-defined Amazon CloudWatch parameters and Amazon will automatically launch (or terminate) servers when certain thresholds are met. If you are using a load balancer, it will also automatically make sure the load balancer is aware of the changes.
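The threshold idea can be sketched as a small Python function. This is not Amazon's implementation or API; the function name, the 80% and 20% CPU thresholds, and the one-instance step size are all assumptions chosen for illustration.

```python
from statistics import mean


def scaling_decision(cpu_samples, current, scale_up_at=80.0, scale_down_at=20.0):
    """Return the desired instance count from recent CPU samples (percent).

    The thresholds and the step of one instance are illustrative assumptions.
    """
    if not cpu_samples:
        # No data: leave the group as it is rather than guess.
        return current
    average = mean(cpu_samples)
    if average > scale_up_at:
        return current + 1
    if average < scale_down_at and current > 1:
        # Never scale below a single instance in this sketch.
        return current - 1
    return current


if __name__ == "__main__":
    print(scaling_decision([90, 85, 95], current=2))
    print(scaling_decision([10, 5], current=2))
```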
I love the way Amazon implemented auto-scaling, but you should take care in how you implement this powerful tool in your own infrastructure. I still recommend some level of cloud infrastructure management software like enStratus as a way to provide context to a new instance when Amazon starts it up. For example, when your application server starts up, the database server needs to grant applications on the app server appropriate database access rights, your security group needs to enable traffic from the app server, and the app server needs to know how to talk to the current database master.
I'll finish with a caution from one of my posts from last year: don't engage in mindless auto-scaling without governors or any thought to capacity planning. The result could simply be a huge AWS bill without any practical benefits to your web site or your business.
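As one possible shape for such a governor, the Python sketch below derives a ceiling from a monthly budget and clamps whatever the auto-scaler asks for to a planned range. The function names, the 730-hour month, and the example budget are hypothetical; real capacity planning would involve more than a price cap.

```python
def max_affordable_instances(monthly_budget, hourly_price, hours_per_month=730):
    """Largest instance count that can run all month within the budget."""
    return int(monthly_budget // (hourly_price * hours_per_month))


def governed_capacity(requested, floor, ceiling):
    """Clamp the auto-scaler's requested instance count to the planned range."""
    return max(floor, min(requested, ceiling))


if __name__ == "__main__":
    # Hypothetical numbers: a $500/month budget at $0.10 per instance-hour.
    ceiling = max_affordable_instances(500, 0.10)
    print(governed_capacity(requested=12, floor=2, ceiling=ceiling))
```

With a ceiling like this in place, a runaway scaling event stops at a cost you chose in advance.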