Like most of us, I have been in the cloud longer than I have thought about being "in the cloud". I've been using the entire suite of Google tools for a while, we use SalesForce.com at Valtira (and have been integrating our clients into SalesForce.com even longer) and Valtira itself is offered as a cloud service.
Only in the last year, however, have we fully embraced the idea of a cloud-based infrastructure. And money—specifically capital costs—made us do it.
A Little Background
For the purposes of this article, Valtira is a marketing platform that is to the marketing department what SalesForce.com is to the sales department. We have always offered our software as a "cloud-like" service in which we take care of the infrastructure and system support. Unlike SalesForce.com or other typical cloud services, our problem domain had a nasty barrier to entry: people generally used our solution only when they were building a brand new web site. So our $10,000 software product ended up being part of a six-figure budget decision, and it required a lot of services work before the customer was deployed.
We wanted to start making key parts of our infrastructure available on-demand in a way that would integrate well with a company's existing web infrastructure. As a result, in 2007, we created a plan for developing on-demand personalization widgets (which we call SmartSpot) and on-demand landing pages. This way, people could personalize their existing web sites, engage in marketing campaigns powered by custom landing pages, and integrate with all their marketing tools without any IT or development support.
The challenge was that this infrastructure needed to be high-availability (HA). We architected the Valtira platform to operate in a high-availability, clustered environment. But we always made clients with HA needs cover the costs for their infrastructure on their own. This time, we needed to buy an HA infrastructure for ourselves.
Valtira, both for better and (in this case) for worse, is entirely self-funded. We funnel our profits into growing the business instead of taking VC money. We simply could not afford the cash outlay to invest in an HA infrastructure.
Looking at Cloud Options
We began researching ways to avoid making the HA investment while still getting an HA environment. We also had two start-up clients that needed to move away from single dedicated hosts into HA environments, but they still had start-up budgets. We dismissed Google App Engine and other similar technologies that required you to write to a specific API. After all, we had a mature existing product and did not want to modify or rewrite it to support some cloud provider.
Eventually, we stumbled into Amazon Web Services. We began with a small pilot program. To be honest, I thought the promise was way too good to be true. At each point, I expected to run into a "gotcha". We ran into several pseudo-gotchas that would have stopped a less motivated researcher. Among the ones we encountered were:
IP address management
IP addresses are all dynamically assigned, you have no netblock at any level, and (once upon a time) there was no option for static IP address assignment.
We believed we could automate a solution to this issue. We crafted some code in short order and then we were on to the next problem.
The lack of persistent storage
This problem no longer exists thanks to Amazon Elastic Block Store. Once upon a time, however, there was no Amazon EBS. If you lost an instance for whatever reason, you lost the data. Game over.
If we were a big company with a lot of cash, this issue would have stopped us. It almost did; after all, Valtira is a database-driven application. We created a solution that essentially kept your MySQL slave synced with Amazon S3 (which was good enough for this particular use of the Valtira platform) and realized this solution had the virtue of providing automated disaster recovery.
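To make the idea of syncing a database slave to S3 concrete, here is a minimal sketch of one way such a job could be structured. It is not the solution we actually built. The names `backup_key` and `ship_dump` are hypothetical, the dump file is assumed to have been produced from the slave already, and `upload` is a stand-in for whatever S3 client call you use.

```python
import gzip
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Optional


def backup_key(prefix: str, when: datetime) -> str:
    """Build a date-sortable object key for a dump taken at `when`."""
    return f"{prefix}/{when:%Y/%m/%d}/dump-{when:%Y%m%dT%H%M%SZ}.sql.gz"


def ship_dump(
    dump_path: Path,
    prefix: str,
    upload: Callable[[Path, str], None],
    now: Optional[datetime] = None,
) -> str:
    """Compress a SQL dump and hand it to `upload` under a timestamped key.

    `upload(local_path, key)` is a hypothetical hook; in practice it would
    wrap an S3 client. Returns the key that was used.
    """
    when = now or datetime.now(timezone.utc)
    key = backup_key(prefix, when)
    with tempfile.TemporaryDirectory() as tmp:
        compressed = Path(tmp) / "dump.sql.gz"
        # Stream the dump through gzip so large files never sit in memory.
        with open(dump_path, "rb") as src, gzip.open(compressed, "wb") as dst:
            shutil.copyfileobj(src, dst)
        upload(compressed, key)
    return key
```

Because every dump lands under a predictable, time-ordered key, restoring onto a fresh instance comes down to fetching the newest object, which is where the disaster recovery benefit comes from.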
The Valtira on-demand products do not really store any sensitive data. Once our enterprise clients heard what we were doing in the cloud, however, they started asking about making the move. They began asking a number of security questions about the cloud:
- How do you do intrusion detection?
- How do you manage authentication credentials?
- How do you VPN back into our backend services?
And many more. These problems seemed really hard, but we eventually overcame them as well.
Go or No Go
Eventually, we proved we could operate in the cloud for both our on-demand needs as well as most enterprise customer needs. The question was, did the costs work out?
To put together a physical infrastructure that met our needs, we would have had to start with an F5 load balancer, two Dell application servers and two Dell database servers. We would also have had to purchase another rack at our hosting provider, along with additional networking and firewall equipment. Taking this approach would have diverted money from other things in the company.
With the cloud, however, we did all of our prototyping on the cheap Amazon instances and even deployed our initial infrastructure on cheap instances (we would never have put our money into such puny servers for a hosted infrastructure). In short, we can achieve 99.99% availability (assuming Amazon lives up to its 99.95% SLA) for about $360/month.
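The arithmetic behind a claim like this can be sketched in a few lines. This is an illustration only. It assumes the 99.95% figure applies to each instance independently, that each tier runs two redundant instances, and that failures are uncorrelated. None of those assumptions is guaranteed by the SLA itself, and the function names are hypothetical.

```python
def parallel_availability(per_instance: float, replicas: int) -> float:
    """Availability of a tier that is up while at least one replica is up.

    Assumes replica failures are independent (an assumption, not a given).
    """
    return 1.0 - (1.0 - per_instance) ** replicas


def serial_availability(*tiers: float) -> float:
    """Availability of a stack in which every tier must be up at once."""
    result = 1.0
    for tier in tiers:
        result *= tier
    return result


SLA = 0.9995  # the 99.95% figure quoted above

app_tier = parallel_availability(SLA, replicas=2)
db_tier = parallel_availability(SLA, replicas=2)
overall = serial_availability(app_tier, db_tier)

print(f"app tier: {app_tier:.6%}")
print(f"db tier:  {db_tier:.6%}")
print(f"overall:  {overall:.6%}")
```

Under these assumptions the redundant tiers clear 99.99% comfortably. Any component left as a single instance would cap the whole stack at that instance's own availability, so the redundancy is what makes the number achievable.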
We made the jump.
Reality in the Cloud
Valtira has been fully operational in the Amazon cloud since May.
We have learned two key lessons:
- the cloud is much more cost effective (don't let people's machine vs. machine comparisons fool you!)
- you need cloud management tools to make the thing work out right
If you compare the costs of an Amazon EC2 instance against the costs of a comparable piece of physical hardware, the EC2 instance almost always comes out being more expensive. Unfortunately, many of the cost analysis articles you see on the Internet engage in this kind of comparison.
The best way to compare costs is to look at how you would build out one infrastructure versus the other. As I noted above, for an HA Valtira on-demand infrastructure, I would have purchased a beefy firewall with beefy servers behind it. I simply don't have the option of trading them in easily without downtime and expense when my needs go up.
In the Amazon environment, however, I started with all of the cheap stuff. It kept working well enough long after we had proven the business model. At that point, it was a simple matter of restarting the instances with the beefier machine AMI.
I also don't have to buy a firewall or any network equipment, and I can live with a cheap EC2 instance providing software load balancing.
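The build-out comparison can be framed as a break-even calculation. The sketch below is illustrative only. The $360/month figure comes from this article, but the up-front and monthly costs for the owned infrastructure are made-up placeholders, not real quotes, and `break_even_month` is a hypothetical helper.

```python
import math
from typing import Optional


def break_even_month(
    upfront_owned: float, monthly_owned: float, monthly_cloud: float
) -> Optional[int]:
    """First month at which owning hardware costs no more in total than renting.

    Returns None when the cloud's monthly bill is not higher than the owned
    infrastructure's monthly bill, since owning then never catches up.
    """
    if monthly_cloud <= monthly_owned:
        return None
    return math.ceil(upfront_owned / (monthly_cloud - monthly_owned))


CLOUD_MONTHLY = 360.0  # from the article

# HYPOTHETICAL placeholders for the physical build-out (not real prices).
HYPOTHETICAL_UPFRONT = 40_000.0
HYPOTHETICAL_RACK_MONTHLY = 500.0

# Hardware with no recurring cost at all still takes years to pay off.
print(break_even_month(HYPOTHETICAL_UPFRONT, 0.0, CLOUD_MONTHLY))

# Once rack fees exceed the cloud bill, owning never breaks even.
print(break_even_month(HYPOTHETICAL_UPFRONT, HYPOTHETICAL_RACK_MONTHLY, CLOUD_MONTHLY))
```

Plug in your own quotes. The point is that the comparison has to include the up-front spend and the recurring hosting fees, not just one machine against another.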
We ended up building our own tools because we had enterprise-level needs that we did not see out there on the market. Whether you write your own or use commercial ones, you will need the help of cloud infrastructure management tools. We spun ours off into another company, enStratus.
Cloud management tools are important because, unless you have software taking care of it all, the cloud will eat up many more person-hours in support (and reduce your availability) than traditional infrastructure would. Thanks to the Amazon Web Services API, almost everything, including disaster recovery, becomes an automated process and ultimately saves you manual work compared with a traditional infrastructure.