Game Audio In The Cloud - Part 3 (Conclusion)

By Peter Drescher
June 3, 2010

At the Game Developers Conference in March 2010, Google (in a rather brilliant marketing move) gave Android phones to all attendees of the Mobile Game Summit, and to every speaker at the conference (gee, think they're interested in taking a bite out of Apple's mobile gaming market?) ... and yes, I got one, and yes, it rocks!

In the previous blog in this series, I discussed the Nexus One's Cloud-based voice recognition service, which is so good, it practically makes up for the lack of a physical keyboard. That's fairly impressive Star Trek technology right there, but as usual, it's the downloadable apps section where the real fun is happening. Given Google's open development platform (as opposed to the iPhone's walled garden), it's no surprise that many interesting and innovative apps are available, and given Google's own brilliant engineering staff, many of them are from Google itself.

Easily my favorite so far: Google Sky Map. I've been an astronomy buff since childhood, and was excited when, a few years ago, Google Earth let me "turn the camera around" to look up at the night sky from my desktop computer. The ability to zoom in on galaxies and nebulae was sweet, and Google Mars was just the coolest thing since flavored toothpaste ... until I discovered Google Street View on the freakin' Moon (generated from Apollo lunar landing images).

Now Android lets me take Sky Map off my desktop and bring it outside, where I can hold it up to the night sky to identify planets and constellations in real time, in augmented reality (I can even point it at the ground to locate stars in the Southern Hemisphere, as if seeing through 8000 miles of solid rock). Now THAT is mind-bogglingly cool! Most importantly, it aptly illustrates a qualitative advance in features and usability, enabled by the leap from wired desktop to wireless mobile.

Mobile Games

In recent years, we've seen similar desktop-to-mobile jumps in gaming, but unfortunately, with the opposite effect. Xbox and PlayStation games are huge, immersive, and LOUD, with multimillion dollar production budgets and marketing campaigns, whereas iPhone games tend towards the other end of the spectrum: small, cheap, and developed in back bedrooms, or by license-holders trying to squeeze a few more bucks out of their franchises. Mobile games are "casual", meaning "designed to kill time while you're waiting for something else to happen in your life", not "an occupation in itself, with skillz to learn, and achievements to brag about".

Nobody is going to argue that "God of War" on a handheld PSP is better than the console version, only that it's more portable. From an audio point of view, earth-shattering explosions on headphones will always pale by comparison to the thump of .1 subwoofers (and let's not even talk about fingernail-sized cellphone speakers!) As far as I can tell, NO port of a desktop / console game to any mobile platform has yet made the kind of quantum leap in functionality and sophistication that Google Sky Map does.

But soon enough, all that's gonna change ...

Cloud-based Games

Now, I am not a game designer, and this is not a game design blog; we talk about interactive audio (with an emphasis on mobile) here, so I'll just let you imagine what playing an augmented reality game, running on a Cloud-based server, using GPS location, would be like. It could be as simple as BattleZone in your backyard, as complex as Call of Duty featuring a squadron of players running around downtown San Francisco armed with laser guns, or even a World of Warcraft amusement park, complete with castles, dungeons, and gigantic (CGI) fire-breathing dragons.



However you wanna play it, the basic concept is simple. Instead of a 3D environment generated and displayed on your TV screen, the playing field is your physical environment, with game levels, characters, and obstacles displayed as an overlay, using Augmented Reality Goggles -- glasses that not only let you see the real world, but also display the computer-generated game world on top of that, like a heads-up display attached to your head. The goggles are directionally aware (so they know where you're looking), and Cloud connected (so the generated environment overlay can be displayed on the lenses). Buildings become force-field barriers, players have game stats floating above their heads, spaceships zoom by shooting disruptor beams at you.

Obviously, these kinds of games will not be possible until ultra-wideband wireless Cloud-access is fairly ubiquitous (though I suspect paid-entrance VR arenas, with RFID localizers, and organized teams, could pave the way). These are not your father's casual games, played on hand-held devices, e.g. Tetris on a cell phone. Augmented reality games will be the equivalent of the Sky Map qualitative leap from desktop to real world, as applied to mobile gaming.

And so the real question in my mind is: What will Cloud-based games sound like?

Audio Generated In The Cloud

Currently, mobile game soundtracks tend to be painfully archaic. Case in point: At GDC, there was a well-attended "industry veterans talk about the early days" session that described techniques used to produce game audio 30 years ago ... techniques that are still used for cell phone games today!

The reason for this is the same as it ever was: bandwidth limitations. Whether it's restricted memory storage on the device, slow CPU speeds, or "thin straw" data networks, mobile audio has always been constrained by one bottleneck or another. Soundtracks are composed of MIDI files playing byte-sized instruments from tiny onboard soundbanks; or low-resolution, highly-compressed WAV file sound effects; or any of a hundred different methods for generating interactive audio in the palm of your hand ... but always with the same result: "sounds bad" -- particularly when compared to the full-resolution streaming audio produced by the media player on the same device!

And therein lies the path to sonic salvation: In my "Myth of Music Ownership" blog, I described a future Cloud-covered world where music is no longer a "thing that you own" (i.e. a downloaded file stored on your disc), but rather a "service you subscribe to" (i.e. on-demand audio streamed off a Cloud server). This becomes possible, convenient, and desirable (I'd say, even inevitable) when your mobile device is always connected to the Cloud via high-bandwidth network, sending up as much data as it downloads ...

Now let's apply that same technology to mobile game soundtracks. Gameplay parameters are sent up to the server (e.g. pushed this button, blew up that spaceship), and the application running on the Cloud server mixes the appropriate beeps and booms into the audio output buffer, which is then streamed to your device as the game soundtrack. The data being transmitted up is rather small (just a few bytes describing the actions being performed), the server has all the CPU power, memory storage, and data bandwidth you could ask for, and the download stream is like listening to a digital radio station.
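
To make the "events go up, mixed audio comes down" idea concrete, here is a minimal Python sketch of what the server-side mixing step might look like. Everything in it is an assumption made for illustration: the event names, the sample rate, the sine-tone "beeps and booms", and the simple hard clipping. A real service would use recorded assets and a proper audio codec, and none of this describes an existing product.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate, kept low so the example stays small


def tone(freq_hz, duration_s, amplitude=0.5):
    """Generate a sine tone as a list of float samples in [-1, 1]."""
    n = int(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


# Hypothetical sound bank stored on the server, keyed by gameplay event name.
SOUND_BANK = {
    "button_pressed": tone(880, 0.05),  # short beep
    "ship_exploded": tone(110, 0.25),   # longer low boom
}


def mix_events(events, buffer_len):
    """Sum the sound for each (event_name, start_sample) pair into one buffer."""
    out = [0.0] * buffer_len
    for name, start in events:
        for i, sample in enumerate(SOUND_BANK[name]):
            pos = start + i
            if pos >= buffer_len:
                break  # sound runs past the end of this buffer
            out[pos] += sample
    # Hard-clip so overlapping sounds stay within the valid sample range.
    return [max(-1.0, min(1.0, s)) for s in out]


# The device only sends these few small events; the server builds the audio.
events = [("button_pressed", 0), ("ship_exploded", 200)]
buffer = mix_events(events, SAMPLE_RATE)  # one second of mono audio
```

The point of the sketch is the asymmetry: the `events` list is a handful of bytes, while `buffer` is thousands of samples that only ever travel in the download direction.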

(Now, I can hear some of you out there exclaiming "Hold on there a second, Mr. Annoying Audio guy! What about lag!? What about network congestion!? What about when you're on the subway!?" ... to which I respond, "Gentle Reader, please suspend your disbelief for a moment while I make my case, because the Cloud technology I am speculating about does not yet exist ... but there's no reason why it can't, based on current trends and the laws of physics.")

3D Audio Transmitted From The Cloud

One of the most compelling aspects of the leap from TV to augmented reality is the way Global Positioning System satellites let computers know where users are in the real world, and at what speed they are moving, and in which direction. People using GPS navigation in their cars take this for granted, but the directional aspect of the technology will have a profound effect on mobile game audio.

The human auditory system determines where a sound is emanating from using Head-Related Transfer Functions (HRTF). Basically, the brain detects and interprets minute differences in the filtering and phase of sounds received by the stereo receptors on your head (aka "ears"). By comparing the variances, you can tell whether the sound came from in front of you, behind, above, or "someplace over there". Humans are really good at this kind of thing, because those who could hear where the wolf was lurking lived longer, and thus bred more, than those who couldn't (evolution, ain't it cool).
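
One of those "minute differences" is timing: a sound off to one side reaches the nearer ear slightly before the farther one. As a rough illustration, the Python sketch below uses Woodworth's spherical-head approximation for that interaural time difference. This is a textbook simplification swapped in for the example, not a full HRTF, and the head radius and speed of sound are assumed typical values.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius, in meters
SPEED_OF_SOUND = 343.0   # meters per second in air at about 20 °C


def woodworth_itd(azimuth_deg):
    """Interaural time difference in seconds for a distant source.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    The approximation is only meant for angles between 0 and 90 degrees.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))


itd_front = woodworth_itd(0)   # no delay for a source dead ahead
itd_side = woodworth_itd(90)   # largest delay, source directly to one side
```

With these assumed values, the largest delay comes out to roughly two thirds of a millisecond, which gives a sense of just how minute the differences are.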

Detecting "3D" sound is also important for gameplay, and is a major impetus for gamers to spend extra cash on surround speaker systems. Being able to identify where the Nazi sniper is shooting from will help you live longer than your stereo-only brethren. These days, increasingly sophisticated HRTF algorithms can fool the ear/brain into sensing sounds in a full 360° field using just two earbuds (granted, mostly horizontal ... up and down is still kinda tricky).

So what happens when I take that technology outside? Imagine playing "Pleistocene Planet" in Central Park: my team is hunting woolly mammoths, and there are other predators around besides man. GPS knows I'm looking north-by-north-west at that herd by the lake, and so when the CGI sabertooth comes running at me from the east, the game audio engine manipulates the HRTF in my headphones to make the aggressive roar and galloping paws sound like they're coming from over my right shoulder ... and when I turn to take aim with my VR-laser-rifle, the direction of the sounds matches the movement of my head, always coming from the east (and getting closer!) At the same time, and in contrast to the generated-environment sounds, the background "danger" music maintains a constant stereo placement, becoming more intense as the tiger's enormous fangs draw nearer.
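
The geometry behind that sabertooth scenario is simple enough to sketch in Python. The function names are made up for illustration, the head heading is assumed to come from whatever orientation sensing the goggles provide, and I'm reading "north-by-north-west" as roughly NNW (337.5 degrees), which is also an assumption.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_azimuth(head_heading_deg, source_bearing_deg):
    """Direction of the source relative to the listener's nose.

    Returns a value from -180 to 180 degrees; positive means to the right.
    """
    return (source_bearing_deg - head_heading_deg + 180.0) % 360.0 - 180.0


# Looking roughly NNW while the sabertooth charges in from due east.
looking_nnw = relative_azimuth(337.5, 90.0)

# After turning to face the threat, the same source is dead ahead.
facing_east = relative_azimuth(90.0, 90.0)
```

Here `looking_nnw` works out to 112.5 degrees (to the right and a little behind, i.e. over the right shoulder) and `facing_east` to 0. Feeding that relative angle to the HRTF renderer every time the head moves is what keeps the roar pinned to the east, while the music simply skips this step and stays put.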

AR-Goggles With Built-In Earpods

Obviously, it will be some time before this kind of game becomes popular enough to cause traffic jams on West 97th, but nothing in my description of Pleistocene gameplay is impossible to achieve. In fact, some of it is already commonplace technology, available on every console game for sale at GameStop. The biggest leap, made possible by GPS, is taking the game off the TV, and bringing it outdoors.

Of course, once you're out there, you don't have that big HDTV monitor ... and tiny cellphone screens just won't cut the gig (neither will tiny cellphone speakers!) Since you won't be carrying around 5.1 speaker systems with you either, your AR-goggles will have GPS, accelerometers, network connections, and earpods built-in -- basically all the functionality of current cellphone technology, miniaturized down to the size of stereo earbuds, with HRTF software, microphones for communication and recording, and Internet access ... all standard (they even make phone calls).

This is not so far-fetched as it might seem. It's really only an extrapolation of current mobile device capabilities, particularly as the hardware gets smaller and more powerful. The Zoolander phone may be an excellent sight-gag, but when cellphones actually do get that small, you won't carry them in your pocket anymore; you'll simply wear them, like contact lenses for your ears. Also note: AR-goggles/earpod headsets require no keyboard for controls ... that's all voice-activated, with vocal recognition and interpretation handled in the Cloud (like Android voice-search does today). Of course, menu items, visual confirmation of selections, messages and alerts, are all displayed on the dual-video screens of the AR-goggles.

The Cloud as Game Console

One of the reasons AR-goggle/earpod headsets will be as inexpensive and commonplace as cellphones is that, as Cloud devices, they don't actually have to be all that powerful. They just have to transmit GPS and game control parameters, while receiving stereo audio/video streams from the Cloud. All the data-intensive processing, environment generating, multiplayer tracking, and 3D soundtrack mixing is performed at high speed on the vast network of outrageously powerful Cloud servers.

Again, this is not that different from current console technology, except that the wires transmitting video to your TV, and audio to your speakers, are replaced by a wireless network streaming images and sounds to your headset. Voice-commands, directional sensing, "trigger pulled" messages, go up to the Cloud, and are equivalent to the signals your console receives from your game controller (many of which are wireless already).
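
For a back-of-the-envelope feel for how lopsided that traffic is, here is a small Python sketch. The message layout, the update rate, and the audio bitrate are all assumptions picked for illustration; no real protocol is being described.

```python
import struct

# Hypothetical control message, packed little-endian with no padding:
# latitude and longitude (float64 each), head heading in degrees (float32),
# and a "trigger pulled" flag (one unsigned byte).
CONTROL_FORMAT = "<ddfB"


def pack_control(lat, lon, heading_deg, trigger_pulled):
    """Pack one control update into bytes for the trip up to the server."""
    return struct.pack(CONTROL_FORMAT, lat, lon, heading_deg,
                       1 if trigger_pulled else 0)


msg = pack_control(40.7829, -73.9654, 337.5, True)

UPDATES_PER_SECOND = 30        # assumed control update rate
AUDIO_DOWNLINK_BPS = 128_000   # assumed bitrate of a compressed stereo stream

uplink_bps = len(msg) * 8 * UPDATES_PER_SECOND
ratio = AUDIO_DOWNLINK_BPS / uplink_bps
```

Under these assumptions each update is 21 bytes, so the uplink needs about 5 kilobits per second, while the audio stream alone is roughly 25 times larger (and video would be far larger still). That is the sense in which the headset only has to be a good radio, not a good computer.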


SO, to summarize: the Cloud is your Xbox, AR-goggles are your TV screen, earpods with HRTF capabilities are your surround speaker system, the Wii-style ray-gun you're aiming is the "fire!" button on your controller, and the wires are simply gone. In this way, mobile gaming will come close to delivering that ultimate goal of console gaming ... immersion.

Social Networking and OnLive

One of the reasons I feel confident making these apparently audacious predictions about the future of augmented reality gaming, is that most of them are just extrapolations of features available on consoles today. MMO and multiplayer systems provide players with a shared generated environment, and coordinated gameplay can be just as much of a social network as anything Facebook does.

Now imagine your online team running around on a football field, wearing AR-goggles, dodging bullets, firing plastic rifles from behind virtual concrete bunkers. Some of your squad is right next to you, but some are on the other side of the planet ... yet you all share the same audio/video playing field experience, like any MMO -- except you get more exercise (or not ... I'm sure some players will prefer to sit in their mom's basement and control remote AR characters. When you're on a playing field in Chicago, armchair avatars dialing in from Des Moines will be practically indistinguishable from your AR squadmates in Brazil and Japan).

And finally, there's our friends at OnLive -- a service that acts almost exactly like the wireless Cloud-based game network pictured above ... except with the wires (and without the AR-goggles). You plug a controller into a high-speed cable Internet connection, control data goes up, the game runs on banks of powerful servers, and audio/video streams down to your TV screen / speaker system. The service will launch any day now, and promises to be a (pardon the pun) game changer for the industry ...

But if the past few years have taught us anything, it's that:
a) anything that can be done with wires can be done better without them, and
b) more mobile = mo' betta'!

SO if OnLive is a success, it seems only a matter of time (and wireless bandwidth) before that kind of system boldly goes mobile into the (augmented) real (GPS mapped, Cloud covered) world. Won't THAT be mind-bogglingly cool!?

   - pdx

(OK, now re-engage your critical belief systems, and comment on why I'm SO wrong! :)

