This is a follow-up article to a piece I wrote in 2007, "Why I Stopped Coding and Why I'd Start Again". I took a several-year hiatus from writing code, mostly because I found that I was spending more time fighting with development tools than working on the problem I was trying to solve. Anyone who has managed systems knows what I am talking about. You can spend hours trying to figure out why a script works on one server but not another, almost inevitably because a switch in a configuration file is set the wrong way. I suppose some people find this to be a fun challenge, but not me. So I decided to take time off from writing software, and spent the past couple of years working primarily on product design.
Recently, I decided to evaluate Google's App Engine environment, in large part because Python is its native language. I have been a long-time Python fan, but as I pointed out in my original article, its database support (an essential element for web programming) put it at a disadvantage compared to other languages like PHP until fairly recently. I was impressed, and in fact flat-out stunned, at how easy it was to work with Google's development environment. In my original piece I laid out a short list of demands, things I'd want to see in a programming language:
- I want a standard library that supports the features that most applications use. If an app needs external libraries, it should load them dynamically and transparently to the user, no complicated include or build process.
- I want to be able to distribute apps, using only the standard library, and not have to distribute all of the underlying libraries (for example, "just copy widget.pyc into /widgets and run it. Have fun!").
- I want a human readable language. If it's full of extraneous formatting characters, I can't read it and will spend my time searching for missing brackets.
- If performance matters, I want the option to compile to a native executable or talk at runtime with an external binary. No build commands.
- I don't want to learn yet another language. That's fun when you have all the free time in the world, but later in one's career, this time is scarce. I'd rather spend my limited free time these days learning new human languages.
- Lastly, I want a language that is baked into computers, much as I can go right to Python in Mac OS. I don't want to install something like .Net from a DVD.
Google App Engine addresses five of these six items fairly elegantly (you obviously can't run your own C code on Google's servers, but performance is less of an issue if you can run your apps in Google's grid computing environment). The base environment provides most of the standard libraries, plus well designed APIs for data storage and retrieval, memcache, external web services, and a few other things that work out of the box. I expect they'll add more features, such as cron jobs, in the future. I was able to use third party libraries, such as Universal Feed Parser, without any drama. So on point #1, GAE lives up to Python's "batteries included" motto, or as Mark Jackson said, "Why settle for snake oil when you can have the whole snake?"
On point #2, deployment, App Engine really could not be any easier. You debug your application in your localhost environment (Google provides a self-contained webserver that simulates the behavior of the grid server). When you're satisfied that everything looks OK, you press the DEPLOY button and it syncs everything with the Google servers. There's no need to configure CPUs, manage Apache, or any of that. In this respect, App Engine is a big step ahead. Services like Amazon Web Services, as advanced as they are, still require you to provision and manage virtual servers, which still need to be configured to run your applications. If you're running a typical web app, this can be a lot of extra work, whereas on App Engine, everything works out of the box.
Python already addresses point #3. Even non-programmers can follow well written Python code, as it's closer to BASIC in form than C++. One of my gripes with C-like languages is that I have a hard time seeing punctuation marks, so I'd spend all of my time hunting for a misplaced semicolon. Python relies on indentation, which also makes it easier to read, at least for me. It also makes many tasks, such as iterating through a list of objects, as easy as "for item in objects:".
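To make the readability point concrete, here is a small sketch of that loop. The list of records and its field names are made up purely for illustration.

```python
# Hypothetical data: a short list of translation records (names are illustrative only)
objects = [
    {"lang": "es", "text": "hola"},
    {"lang": "fr", "text": "bonjour"},
]

# Indentation alone marks the loop body; there are no braces or semicolons to hunt for
summary = []
for item in objects:
    summary.append(item["lang"] + ": " + item["text"])

print("\n".join(summary))
```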
On point #4, Google is promising an essentially infinite computing resource once they open the system up for metered rate billing, which will be similar in cost to other services like Amazon Web Services. What's even better is that you don't need to think about how many CPUs to provision. The App Engine environment will handle all of that for you. I am sure there will be glitches, as there are with any first generation product, but Google has recruited some of the best computer scientists in the world. I trust them to figure things out. The only point where App Engine does not meet my list of demands is the ability to run specialized code, long running processes, etc. That's OK: if I really need that, I can set up a separate server somewhere else.
Points 5 and 6, also taken care of. I can use a language I already know and like, and since I use a Mac, Python is baked right in. No need to go install a multi gigabyte package (cough .Net).
I put App Engine to the test by migrating my current project, the Worldwide Lexicon. I initially planned to build a simple demo app and leave it at that, but we were about to start work building our collaborative translation memory on Amazon Web Services. I like AWS, so this is not a critique, as I think both platforms are compelling, but each is designed to do different things. I started this project almost on a lark, to see how far I would get. I was surprised to discover that it took me less than a month to rebuild the system essentially from scratch. WWL is well suited to App Engine since it is, first and foremost, a database application that stores and indexes a large number of relatively short texts.
We now have an open translation memory that is accessible as a web service. With it you can fetch and submit translations for texts, submit texts to be translated by other users, submit and fetch scores for translations, and manage users. WWL, in its new form, is essentially a language/translation API that can be incorporated into any website, AJAX widgets, you name it. If you'd like to have a closer look, the web services and documentation are at worldwidelexicon.appspot.com for now (if you're a developer and are interested in language, translation, etc, we'd love to hear from you).
App Engine was interesting both because it provides web developers with an environment that works out of the box and because it makes scaling your application easier. It's a different environment, but I did not find that to be a deterrent, in part because I had been on hiatus, and was open to doing things differently. Because I had been doing other things for a while, I approached this as an opportunity to relearn from scratch, so except for a couple of parsers, I didn't use any tools or libraries beyond the base environment. If you're planning on using a set of existing tools, my guess is that you might run into problems because App Engine disables some libraries (such as for socket access). I can see where that will cause problems with other packages that may have calls to forbidden services hidden several layers down. This is a pretty important caveat. If you're new to Python, or have been using it primarily for other types of software, I think you'll find App Engine very easy to work with. If you're hoping to port an existing web app that's dependent on other tools, it might be a challenge. Personally I am of the view that it's good to take a clean slate approach when working with a new system like this (and even to rebuild older apps from scratch periodically, so you're forced to re-assess what you did years before).
App Engine, as good as it is, is lacking in some key areas. If you're building a typical web application, none of these are deal breakers as you can fill these gaps by hosting these apps on other services. I expect that most or all of these items will be addressed in some form as Google expands the platform. The key things that are missing are:
- Cron jobs : there isn't currently a way to run scheduled processes, or for a script to schedule a process to run later. This is important for things like housekeeping and background tasks. It would also be nice to have something like Amazon's Simple Queue Service. The App Engine team tells me cron will be available sometime within six months. In the meantime, there is a simple cheat. Write some scripts that live on another PC that wake up and ping URLs on your App Engine server at the appointed time (for example, to run a daily report script at 12am).
- File system : App Engine does not provide a bulk storage solution similar to S3; however, the App Engine team has this on their product roadmap for 2009. For now, you can work around this by writing a lightweight S3 app as a web service for large file and object storage. Another workaround is to divide an object into chunks to be stored as blobs within the datastore (similar to sharding).
- Long running processes : App Engine does not provide a mechanism for launching long running processes, and in fact it kills processes that run longer than a few seconds. This makes sense because it is a shared tenant environment and they don't want someone starting a batch of video encoding processes that will make load balancing a headache. Not sure if they plan to deal with this or not. Again, this is not that big of a deal since you can build separate boxes to run this sort of software (and if you're doing something that resource intensive, like transcoding, you probably want to do that using a specialized process like FFMPEG anyway).
- Better DNS integration : if you only need to map a static domain (e.g. www.yoursite.com) App Engine works pretty well, but if you have a complex namespace (e.g. something.else.yoursite.com), it won't accommodate this. Hopefully they'll allow you to map a static IP address to your app. Then you can set up your DNS namespace however you like. Not sure where they stand on this, but they seem to be following AWS closely, so I'd assume it's on the roadmap somewhere.
- Capacity : App Engine currently limits system capacity during the beta period, but they are preparing to offer metered rate billing similar to their AdWords model. The App Engine team told me that billing will be rolled out in the near future, so this should be a non-issue shortly.
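The chunking workaround mentioned under "File system" above can be sketched in plain Python. This is a minimal illustration rather than App Engine code: the helper names `split_into_chunks` and `join_chunks` and the chunk size are my own assumptions. On App Engine, each chunk would presumably be saved as a blob property on its own entity, keyed by the object name and the chunk index.

```python
CHUNK_SIZE = 1000000  # assumed chunk size in bytes; check the datastore's actual entity size limit

def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split a byte string into an ordered list of (index, chunk) pairs."""
    return [(i // chunk_size, data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def join_chunks(chunks):
    """Reassemble the original byte string from (index, chunk) pairs in any order."""
    return b"".join(chunk for _, chunk in sorted(chunks))
```

Storing the index alongside each chunk means the pieces can be fetched in any order and still be reassembled correctly.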
Update: The App Engine team alerted me to several items which are being added to the six month product roadmap as this is going to press.
- Scheduled Tasks (Cron) : an API to schedule tasks is slated for the six month roadmap. This will enable applications to schedule scripts to run at a specific time or timed interval.
- Incoming Email Queue : this interface will translate incoming email messages into CGI calls that can be processed by App Engine scripts. This will enable you to build handy admin tools where you can email commands from your mobile device, as well as build email processing applications.
- Dynamic Background Task Queue : this API will enable users to schedule and manage background tasks. It's unclear how this will be implemented, and whether this will include long running tasks, but it is distinct from the cron tool.
- XMPP API : one of the most interesting pieces of information they shared was news about an XMPP API that will enable App Engine processes to interact with and control external XMPP (Jabber) services. This will enable you to create real-time communication applications, instant messaging apps at the very least.
- Quotas will be lifted shortly, as the option to pay for increased bandwidth, CPU, etc is imminent.
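Until the scheduled task API arrives, the "simple cheat" described earlier (a script on another machine that wakes up and pings a URL on your App Engine app) might look something like the sketch below. It is written for a current Python 3 interpreter on the external machine, and the URL, function names, and default time are all assumptions made up for illustration.

```python
import datetime
import time
import urllib.request

# Hypothetical URL of a report handler on your App Engine app
REPORT_URL = "http://yourapp.appspot.com/tasks/daily_report"

def seconds_until(hour, minute, now=None):
    """Return the number of seconds from now until the next hour:minute."""
    now = now or datetime.datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += datetime.timedelta(days=1)  # already passed today, so aim for tomorrow
    return (target - now).total_seconds()

def ping(url):
    """Request the URL so the handler behind it runs; return the HTTP status."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.status

def run_daily(url, hour=0, minute=0):
    """Loop forever: sleep until the appointed time each day, then ping the URL."""
    while True:
        time.sleep(seconds_until(hour, minute))
        try:
            ping(url)
        except OSError as error:  # urllib's URLError is a subclass of OSError
            print("ping failed:", error)
```

Calling `run_daily(REPORT_URL)` would start the loop with the default of midnight. On a Unix box, an ordinary crontab entry that fetches the URL would do the same job with less code.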
The bottom line: App Engine is one of the best designed development environments I have seen. More importantly, it was designed from its foundation for web programming, with an emphasis on scalability and ease of deployment. While some people have critiqued it for lacking certain things like a local file system, it makes sense in the context of how Google's infrastructure works, and what they need to do to make things scale. The adjustments you need to make in building apps for the environment are pretty minor, so overall, I was very impressed with it. I would compare it, in many ways, to Visual Basic. It's easy to throw mud all over VB, but when it came out, it was one of the first tools that made developing Windows applications relatively easy. The language itself was pretty ugly, at least until they updated it with .NET, but you could build useful things that worked. App Engine runs on a much better language, but like VB, it provides developers with a ready made set of tools and services so they can focus on solving their problems, versus figuring out why PHP can see Postgres on Server A but not on Server B.
One important criticism about App Engine is that it is a proprietary solution, so anything you build there will be locked in there. That's a valid concern, but I don't think Google is going away anytime soon. Developers could have said the same thing about Visual Studio, and did, but the tools offered enough value that millions of developers used them over other options. In the short term, it's true that you'll be locked in, but there are strategies you can use to mitigate this risk. In the long run, I expect that competitors such as Amazon will develop similar services, and will provide libraries that mimic the behavior of Google's libraries closely enough that porting will not be too much of a headache. For example, Amazon could build a Python library that talks to SimpleDB, and presents itself to your app in the same manner as Google's db.Model system. It might not map one to one, but it would be close enough that you could migrate if you needed to, especially if you were careful about the types of data stores you created. (I am hedging my bets by designing my datastore so it is structured the same as a conventional SQL database, so if I decide to move to another environment, everything is in conventional row/column tables.) The point is that App Engine is likely to become a blueprint that other service providers will follow, so there should be some informal standards before too long. But yes, if you dive in today, you're throwing your lot in with Google, which really is not such a bad place to be.
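One way to act on that hedging strategy is to keep application code behind a small storage interface, so that only one module knows which data store sits underneath. The sketch below is an illustration under my own assumptions: `TranslationStore`, `InMemoryStore`, and the field names are invented here and are not part of App Engine or WWL. An App Engine backend would implement the same two methods on top of `db.Model`, and a SQL backend could replace it later.

```python
class TranslationStore(object):
    """Hypothetical interface: the only storage calls the app is allowed to make."""

    def save(self, row):
        raise NotImplementedError

    def find(self, **filters):
        raise NotImplementedError


class InMemoryStore(TranslationStore):
    """Stand-in backend that keeps flat rows (dicts of simple values) in a list."""

    def __init__(self):
        self.rows = []

    def save(self, row):
        self.rows.append(dict(row))  # copy so later changes by the caller don't leak in

    def find(self, **filters):
        # return every row whose fields equal all of the given filter values
        return [r for r in self.rows
                if all(r.get(k) == v for k, v in filters.items())]


store = InMemoryStore()
store.save({"source_lang": "en", "target_lang": "es", "text": "hello", "translation": "hola"})
store.save({"source_lang": "en", "target_lang": "fr", "text": "hello", "translation": "bonjour"})
matches = store.find(target_lang="es")
```

Keeping each row to simple string and number fields is what makes the data easy to move into conventional row/column tables later.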
How does App Engine compare to Amazon Web Services? The major difference between App Engine and AWS is that AWS is designed around virtual machines, so you still need to provision and configure CPUs, although you can do this on demand to respond to variations in load. This gives you a great deal of flexibility, since you can, for example, light up a hundred virtual servers to perform a nightly task that is compute intensive, and then shut them down. The underlying concept, computing on demand, is similar, but in AWS, you must still think in terms of servers and the software running on them, as well as proactively manage them, whereas App Engine is simply a CGI processing service with no notion of a persistent server. If you're building a data centric web app, and don't need to worry about running lots of background processes, App Engine is compelling. If you need to do things like transcoding, or a long running statistical analysis, you're probably better off with AWS, or some combination of the two (you can have background apps at AWS pull data out of App Engine, process it, and send it back). So I don't see it as an either/or decision. In our case, I expect we'll have some services running at Google, and others, such as presentation layer services, running elsewhere, whether it is Amazon or our existing rent-a-server facility.
In a way, the deficiencies of each product are the opposite of the other's. Amazon has a very well thought out suite of component services (S3/storage, SQS/messaging, SimpleDB/data store, and EC2/elastic compute), but lacks the simple runtime environment Google provides. Conversely, Google provides a very simple and intuitive CGI service, but lacks some of the services Amazon offers, so it makes sense that both products will converge over the next year or so to become functionally similar.
Lastly, on a closing note, I would like to thank Guido van Rossum, the inventor of Python, and now part of the App Engine team, for his work on both projects. It's nice to see good work rewarded instead of thrown under a bus.
Example : Building A Web Analytics Tracker
This example illustrates how easy it is to build a useful web service on App Engine. This sample program tracks incoming visitors, geolocates them, determines what languages they speak, and logs this information in the data store. The program combines several of the App Engine libraries, including: datastore (database), memcache (for fast in-memory counters, etc), external URL access (URL Fetch), and the standard WSGI library. The program captures the user's IP address, uses a geolocation service to resolve the user's country, examines browser language preferences, and then increments an in-memory (memcache) counter before redirecting to an image file. This can be used as a counter/tracker by simply embedding an image tag in a website that loads
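from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.ext import db  # import the Google data store API
from google.appengine.api import urlfetch  # import the URL Fetch API, for accessing external URLs and web services
from google.appengine.api import memcache  # import the memcache API

# NOTE: the indented lines of this listing were missing; lines marked "reconstructed" are a best guess from the comments
# this is a crude little parser that grabs the ISO language codes out of the Accept-Language header sent by the user's browser
def parseLanguages(header):  # reconstructed
    codes = []  # reconstructed
    languages = header.split(',')  # reconstructed
    for i in languages:
        code = i.split(';')[0].strip().lower()  # reconstructed: drop any ";q=" weight
        if code and code != '*':  # reconstructed
            codes.append(code)  # reconstructed
    return codes  # reconstructed

# define a simple data store to log incoming queries by IP address, country, and the user's primary language
class Log(db.Model):  # reconstructed
    timestamp = db.DateTimeProperty(auto_now_add = True)
    remote_addr = db.StringProperty()
    country = db.StringProperty()
    language = db.StringProperty()

# create a request handler that processes queries, identifies the user by IP and language, logs query, and redirects to a logo image (/images/logo.png)
class Tracker(webapp.RequestHandler):  # reconstructed
    def get(self):  # reconstructed
        # grab the user's IP address and language header from HTTP headers
        remote_addr = self.request.remote_addr  # reconstructed
        languageheader = self.request.headers.get('Accept-Language', '')  # reconstructed
        # parse the Accept-Language header to build a list of preferred languages
        languages = parseLanguages(languageheader)
        # ping the maxmind geolocation service, it will return a two letter country code (you need an API key, other services they offer resolve location to the city level)
        url = 'http://geoip1.maxmind.com/a?l=apikey&i=' + str(self.request.remote_addr)
        result = urlfetch.fetch(url)
        if result.status_code == 200:
            country = result.content
        else:  # reconstructed
            country = ''
        # create a log object using the Log datastore model, populate fields, and then save it
        log = Log()
        log.remote_addr = remote_addr
        log.country = country
        if len(languages) > 0:
            log.language = languages[0]  # primary language only; a StringProperty cannot hold the whole list
        log.put()  # reconstructed
        # create or increment a memcache counter named 'counter' to track total number of hits
        if memcache.get('counter') is None:
            memcache.set('counter', 1)  # reconstructed
        else:  # reconstructed
            memcache.incr('counter')  # reconstructed
        # redirect, user's browser loads /images/logo.png
        self.redirect('/images/logo.png')  # reconstructed

application = webapp.WSGIApplication([('/logo',Tracker)],
                                     debug=True)  # reconstructed

def main():  # reconstructed
    run_wsgi_app(application)  # reconstructed

if __name__ == "__main__":
    main()  # reconstructed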
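The listing describes its language parser as crude. As a hedged, standalone sketch (plain Python, not part of the original program), the function below also honours the optional `q` weights that browsers may attach to entries in an `Accept-Language` header, so the most preferred language comes first. The name `parse_accept_language` is my own.

```python
def parse_accept_language(header):
    """Return language codes from an Accept-Language header, best match first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        code = pieces[0].strip().lower()
        if not code or code == "*":
            continue  # skip empty entries and the wildcard
        quality = 1.0  # a language with no explicit q value defaults to 1
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable weight as least preferred
        weighted.append((-quality, position, code))
    # sort by descending quality, keeping header order for ties
    return [code for _, _, code in sorted(weighted)]
```

The first element of the returned list is the natural value to store as the visitor's primary language.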