Game Audio In The Cloud

Generated in the Cloud and streamed to your mobile headset

By Peter Drescher
March 26, 2010

At the recent Game Developers Conference in San Francisco, for the panel entitled "After The iPhone ... What?", I was introduced by the panel chair, Jeff Essex of Audiosyncrasy, as follows:

"When I asked Mr. Drescher to speak on this panel, he was Audio Director for Microsoft's Entertainment eXperience Group, a design resource for the Entertainment and Devices division, which includes Xbox, Windows Mobile, Zune, and other products. He was assimilated into that position after the Microsoft acquisition of Danger, Incorporated, where he was Principal Sound Designer for the T-Mobile Sidekick, producing ringtones, system sounds, and game audio for all versions of that device. However, apparently, the implants are removable, and he is currently working as an independent contractor out of his Twittering Machine project studio, and writing the Annoying Audio blog on interactive and mobile audio topics for O'Reilly.com."

Here's what I said:
---------------------------------------------------------------------------------------

Past

In 2002, at the International CES trade show in Las Vegas, Nevada, Mark "the Red" Harlan, then Chief Evangelist for a scrappy little start-up called Danger, Incorporated, demonstrated an early version of a wireless internet device called the "hiptop" (later known as the T-Mobile Sidekick). He explained that it was a prototype, costing many thousands of dollars to produce, then he navigated to the Notes application, typed in a message, hit enter, and waited a moment while the Note synced to the Danger servers via wireless connection.

Then he put the device on the floor, and dropped a bowling ball on it!

BAM!! Bits of plastic and electronic components everywhere! While the audience reeled and laughed, Mark fished the SIM card out of the wreckage, plugged it into another hiptop prototype, booted it up, navigated to the Notes app ... and the message was still there! (Cue applause from an astonished audience!)

Now, at the time, we didn't realize that this was actually a demonstration of the power of Cloud technology, where your data lives on the network, not in your pocket. We just thought it was a fun way to show off how our little device worked. The hiptop was what is called a "thin client", meaning the device itself was quite low-powered, with very little memory and a slow CPU. All of the cool stuff it could do, like surf the web and email pictures, was made possible by banks of powerful servers that did the heavy lifting, and transcoded the data for fast, efficient, delivery to the device.

The business plan was also revolutionary. Danger made its money by managing the data, not selling hardware. In fact, we lost money on every unit we sold, but made it back, and more, by getting a cut of the subscription fees charged by the carriers.

As new versions of the Sidekick were developed, various bottlenecks widened. The GPRS network was first enhanced by EDGE technology, and current devices support 3G speeds. Storage capabilities increased dramatically as well. The first device shipped with only 16 MB of RAM, whereas current versions feature 256 MB of internal flash, 128 MB of RAM, and a slot for a MicroSD card that goes up to 8 GB.

But one thing that never expanded much was the memory allocation for audio. Late model Sidekicks use just over one megabyte to produce all audio built into the OS, and that includes 40 UI sounds, about 75 different ringtones and alerts, plus the bootup animation soundtrack.

Ringtone size is limited to 100k max, and only 10 ringtones downloaded from the Catalog can be stored on the device at any one time. Game memory constraints are even tougher. Until recently, the maximum downloadable application size was only 320k, leaving very little room for game soundtracks, which usually consisted of simple MIDI files, plus a couple of low resolution sound effects ... when they had any soundtracks at all.

Present

On February 11, 2008, I woke up to discover I was a Microsoft employee. The Corporation had acquired our little startup for half a billion dollars, and now the Sidekick is a Microsoft product, alongside of, and competing with, the Windows Mobile platform. Comparing the two systems can be educational.

Alan Kay wrote "People who are serious about software build their own hardware", and therein lies the most basic difference between the Sidekick, whose physical elements were produced by the same people writing the software, and the Windows Mobile operating system, which is designed to run on over four hundred different models of cell phone, each with its own form factor, audio capabilities, and speaker configurations.

There are some real advantages to producing mobile audio when you know the strengths and limitations of your target system. You can tweak your game soundtrack to be loud and clear, using all the tricks of the trade, because you can test your sounds in context, in real time, on the actual speaker, in the actual enclosure, using the correct electronics, and the latest versions of the firmware and the audio engine. Each of those items will color, constrain, or distort your audio output to one degree or another.

Producing audio for a WinMo phone can be a trickier proposition, because you have NO idea how your sounds will be modified by all the twists and turns of the pathway from sample data to vibrating air molecules. Therefore, rather than pushing the envelope, as I would try to do for Sidekick games, you have to take a more middle-of-the-road approach, creating audio that you think/hope/pray will mostly sound good on most cell phones, most of the time.

You'll want to use the Microsoft standard supported file types, like low resolution WAVE files for game sound effects, and compressed WMA files for streaming audio like ringtones. While other standard formats like AIFF and MP3 are supported, they can sometimes require additional software or compression codecs be installed. Certainly, you will want to avoid using compound proprietary formats like RMF, or its open equivalent XMF. These formats combine sample and MIDI data into one container, and allow for sophisticated interactive audio functionality, but also require installation of the Beatnik Audio Engine. Of course, that's how I produced all of the audio on the Sidekick, because a custom version of the Beatnik engine shipped on every device.

Volume control makes for another interesting Sidekick-vs-WinMo comparison. On the Sidekick, there are numerous levels of gain control: note velocity, volume controllers in both the MIDI data and the instrument definitions, levels set in the Java code resource files, system-wide volume levels by notification type, gain / EQ settings in the firmware, etc. That kind of flexibility is both a blessing and a curse -- it allows for detailed control over a wide range of sounds, but also dramatically increases the complexity of volume setting and bug fixing.

WinMo phones, on the other hand, take the opposite approach. ALL sounds produced by the OS are played at 100% volume, every time. If you want your email alert to play at half the volume of your default ringtone, you have to bake the volume level into the WAV file, so that it only uses half the available dynamic range. This may cause an increase in perceived noise levels, particularly if the sample is then heavily compressed, but since it's playing on a speaker the size of your fingernail, nobody seems to care.
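For readers who want to see what "baking the volume level into the WAV file" might look like in practice, here is a minimal sketch in Python, using only the standard library. It is an illustration, not a tool from the Sidekick or Windows Mobile toolchains: the function names (scale_pcm16, gain_to_db, bake_gain) are made up for this example, and it assumes uncompressed 16-bit PCM input. Halving every sample value lowers the level by about 6 dB, which is roughly one bit of resolution given away.

```python
import array
import math
import sys
import wave


def scale_pcm16(samples, gain):
    """Scale signed 16-bit samples by a linear gain, clamping to the valid range."""
    return [max(-32768, min(32767, int(s * gain))) for s in samples]


def gain_to_db(gain):
    """Convert a linear gain factor to decibels (0.5 is roughly -6 dB)."""
    return 20.0 * math.log10(gain)


def bake_gain(in_path, out_path, gain):
    """Write a copy of a 16-bit PCM WAV file with the gain baked into the samples.

    Hypothetical helper for illustration; it assumes uncompressed 16-bit audio.
    """
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = src.readframes(params.nframes)

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    scaled = array.array("h", scale_pcm16(samples, gain))
    if sys.byteorder == "big":
        scaled.byteswap()

    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(scaled.tobytes())
```

Calling bake_gain("alert.wav", "alert_half.wav", 0.5) (both file names are placeholders) would produce an alert that plays at half amplitude even though the OS still plays it at 100% volume. The quieter file uses fewer of the available sample values, which is where the extra perceived noise comes from.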

Of course, Windows Mobile has at least one huge advantage over "boutique" operating systems like the Sidekick: ubiquity. There are literally hundreds of millions of WinMo phones out there, running on networks all over the planet. This not only allows them to be cheap, if not free, it also provides an enormous install base of potential customers. If your game becomes popular with only a tiny fraction of those users, that's a hugely profitable market.

Future

Unfortunately for my friends in the Windows Phone division, that market seems destined to dwindle. In a recent speech at the Waldorf-Astoria in New York City, industry analyst Mark Anderson of the Strategic News Service predicted "Except for gaming, it's 'game over' for Microsoft in the consumer market ... It's time to declare Microsoft a loser in phones" (#8, starting @ 32'32"). There seem to be two main reasons for this:

The first is timing. When David Pogue reviewed the Bing search engine this past July, he wrote "For the last 15 years, Microsoft's business plan seems to have been, 'Wait until somebody else has a hit. Then copy it.'" While this strategy has been fairly successful for many Microsoft products, it does have an inherent weakness -- there is usually about a five year delay between the other guy's hit, and the release of the Microsoft version.

But in the wireless world, five years is a sea change in technology. Five years after the Sidekick was developed, the first iPhone was announced. Five years from now, there will be some new device that makes today's iPhone seem quaint and obsolete. Thus, if Microsoft continues to "copy the other guy, five years later", they will always be behind the curve, and always selling outmoded products. That may be a workable business plan for servers and enterprise software, but not for a fast moving, fashion statement, "using last year's cell phone is like wearing last year's designer jeans" kind of consumer market.

The second reason is that the basic concept of a Windows Phone as a powerful handheld computer will become increasingly irrelevant. In the not too distant future, the device in your pocket will derive the majority of its power from "The Cloud", a vast array of high-speed servers, connected to an ultra-wideband radio network. This (currently fictional) technology is the wireless equivalent of today's high-speed Internet, and will have profound implications for business, social networking, gaming, music distribution, and other online activities.

A Cloud Device will work more like a Sidekick than a Windows Phone. It will always be connected to the network, and will send up as much data as it downloads. It will be your phone, your web browser, your GPS navigator, your media player, your email account, your game platform, your video camera, your social-network communicator, and more, but instead of carrying around all those gigabytes of data in your pocket, or doing CPU-intensive processing in your hand, it will simply access the Cloud for whatever you want, whenever you want it, wherever you happen to be, all for a flat monthly subscription fee.

Cloud-based services will eventually become so mind-bogglingly useful, indispensable, and universal, you simply won't be able to do business without them. As a game platform, they will enable mobile social-networking games that make Mafia Wars seem like Space Invaders. Game audio currently produced by your console or your computer will simply be generated in the Cloud and streamed to your mobile headset, and to the headsets of the people you're playing with, all over the planet.

And if someone drops a bowling ball on your Cloud device ... no problem! All of your personal data, photos, music playlists, and game scores are still safe and secure on the servers. Simply log into your account using another thin cheap Cloud device, and continue on with your day. That day is coming, my friends, and Cloud technology is the next logical step after the iPhone business model ... but you'll have to wait AT LEAST another five years before it's available.

Thank you!
---------------------------------------------------------------------------------------

Q&A

After the presentations, there was some discussion of what "game audio in the Cloud" might mean to the other esteemed members of the panel. In response to a question from the audience, I briefly expanded on the idea that a Cloud-based mobile social-networking game would run on the server, take input from multiple devices in many locations, and send real-time game parameters to all connected players. Game audio would be generated and mixed in the Cloud (not on your mobile device), and streamed out via the same kind of futuristic technology used by Cloud-based music distribution systems (I will examine this topic in more detail in a subsequent blog).
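To make the "mixed in the Cloud" idea a little more concrete, here is a minimal sketch in Python of the mixing step only. It is a thought experiment, not a description of any existing service: the function names (mix_blocks, mix_for_players) and the data layout are assumptions made for this example, and it ignores the hard parts, namely encoding, network transport, and latency. Each player gets their own mix, built by summing short blocks of 16-bit samples from whatever sound sources that player should hear.

```python
def mix_blocks(blocks):
    """Sum equal-length blocks of signed 16-bit samples, clamping the result."""
    if not blocks:
        return []
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    mixed = []
    for frame in zip(*blocks):
        total = sum(frame)
        # Clamp to the 16-bit range so loud passages clip instead of wrapping.
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def mix_for_players(sources_by_player):
    """Build one mixed block per player from the sources that player hears.

    sources_by_player maps a (hypothetical) player id to a list of sample blocks.
    """
    return {
        player: mix_blocks(blocks)
        for player, blocks in sources_by_player.items()
    }
```

In a real system, each mixed block would then be compressed and streamed to the matching headset. The point of the sketch is simply that the device receives finished audio and does no mixing of its own.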

Then I handed the mic off to David Sparks, Engineering Manager for Google's Android platform, who got an amused response from the audience by saying, "Well, we are the Cloud!" He felt that while it made sense to process some mobile features on the server, other types of functionality are more appropriately processed by the device, particularly when low latency is required. Jeff Bush (former Danger, former Apple, currently running a games group for Palm) echoed that sentiment, making a distinction between data processed by a Web-based server and then downloaded, and data sent to the device from a Web server to be processed locally.

Now these are both really really smart guys, and I'm certainly not foolish enough to disagree with them. But I do think they were referring to currently shipping products, whereas my dream of the Cloud as a vast distributed computing plexus, ubiquitously and unfailingly accessible at speeds unattainable by today's radio networks; where mobile phones act more like wireless display terminals, not little connected computers -- this technology does not actually exist ... yet. But it seems clear to me this is where the mobile industry is headed (provided the world doesn't go to hell in a handbasket first ... an increasingly unpleasant possibility).

In any case, our panel certainly got people thinking about development issues for mobile devices other than the iPhone, and thus I am calling it "mission: accomplished!" In a final irony, Google, in a rather clever marketing ploy, gave Android phones to all attendees of the Mobile Summit, and every GDC speaker (gee, ya think they're interested in going after the mobile game market!?) ... and so I am now the proud owner of a Nexus One.

And not surprisingly, this thing really is a step in the direction of a Cloud device (plus it sports some amazing augmented reality functions too, like GPS-based Google Sky Maps ... hold the phone up to the night sky to identify real-time locations of stars and planets ... now that's frakkin' cool!)

   - pdx
