Analysis 2009: The Web Services Era Begins in Earnest

By Kurt Cagle
January 6, 2009

(Warning: this gets technical.)

This may seem a rather odd statement - after all, "web services" in the traditional SOA sense have been around for the last decade, give or take a few years. I believe, however, that while there have been some real (and important) success stories in that period, for the most part we're only just beginning to understand what distributed services are all about.

In the desktop application era, the dominant theme was efficient coupling of interfaces. As an application developer you focused on performance, on building as much of the model directly into the application as possible, and on packing as many features as you could manage into the application, in order to make it appeal to the broadest potential market. For desktop applications, these are still appropriate considerations.

The web services era - or perhaps more accurately, the distributed services era that we are now entering - has considerably different priorities. In most distributed services, it is likely that different nodes within the network will use different vendors' products or open source tools to perform the processing on their end. This in turn places a much higher premium on the interchangeability of information, which means that simple messaging and conformity of interfaces become more important than robust feature sets. (It also makes it harder, and generally less desirable, for one vendor to provide "all-in-one" solutions.)

Desktop applications were typically only intermittently asynchronous - user interface components had to be asynchronous and event driven, but most of the underlying logic was still built assuming a high degree of synchronization between libraries. In a distributed application, asynchronicity - a lack of knowledge about when a given packet of information will arrive to be processed - is the rule rather than the exception. This means that the dominant form of operation in such systems is the message queue, which processes information independent of when that information arrives.
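
The message-queue pattern can be sketched in a few lines - a worker consumes messages whenever they happen to arrive, decoupled from any producer's timing. This is a minimal illustration using Python's standard library, not any particular queueing product; the message names are invented.

```python
import queue
import threading

inbox = queue.Queue()
results = []

def worker():
    while True:
        msg = inbox.get()            # blocks until a message arrives
        if msg is None:              # sentinel: shut the worker down
            break
        results.append(msg.upper())  # stand-in for real processing

t = threading.Thread(target=worker)
t.start()

# Messages can arrive at any time, in any order, from any producer;
# the worker neither knows nor cares when they were sent.
for m in ["order-received", "payment-cleared"]:
    inbox.put(m)

inbox.put(None)
t.join()
print(results)
```

The essential point is that the worker's loop contains no assumptions about arrival time - the queue absorbs the asynchronicity so the processing logic doesn't have to.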

Desktop applications typically baked the logic for the application into the binary code of that application, in order to make execution as efficient as possible. Distributed applications, on the other hand, usually work far better by treating the business rules of the application as data streams themselves - streams that can be changed depending upon other factors in the system. Such switchable logic isn't as efficient in terms of processing speed, but it is usually far better suited to a complex business environment.
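
A minimal sketch of rules-as-data, assuming a hypothetical rule schema of my own invention: the rules live in a plain data structure (which could just as easily arrive over the wire as XML or JSON) and are interpreted at runtime, so they can be swapped without rebuilding the application.

```python
# Hypothetical rule schema: each rule tests one field and yields a tag.
rules = [
    {"field": "amount", "op": "gt", "value": 1000, "then": "needs_approval"},
    {"field": "country", "op": "eq", "value": "US", "then": "domestic"},
]

OPS = {"gt": lambda a, b: a > b, "eq": lambda a, b: a == b}

def apply_rules(record, rules):
    """Return the tags of every rule the record satisfies."""
    return [r["then"] for r in rules
            if OPS[r["op"]](record[r["field"]], r["value"])]

order = {"amount": 2500, "country": "US"}
print(apply_rules(order, rules))
```

Replacing `rules` with a freshly loaded stream changes the system's behavior with no change to the interpreter - the trade the paragraph above describes: slower per record, far more adaptable.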

Desktop applications are usually built assuming that state changes apply to a finite set of atomic properties, and the role of document serialization is simply to freeze those properties and then thaw them out at some later point. Distributed applications, on the other hand, work on the assumption that documents themselves are the "atomic" properties of the system, and that the role of the messaging system is to move various and sundry documents between processors, document producers and document consumers.
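
The document-centric view can be sketched like this: the unit of state is a whole document, and "processors" are just functions that accept one document and emit another, agnostic about what the document means beyond the little they touch. The element names here are illustrative.

```python
import xml.etree.ElementTree as ET

def validate(doc):
    # Processor 1: reject documents missing an id.
    assert doc.find("id") is not None, "document has no id"
    return doc

def stamp(doc):
    # Processor 2: annotate the document and pass it along.
    ET.SubElement(doc, "status").text = "processed"
    return doc

pipeline = [validate, stamp]

doc = ET.fromstring("<order><id>42</id></order>")
for processor in pipeline:
    doc = processor(doc)

print(ET.tostring(doc, encoding="unicode"))
```

Nothing in either processor knows it is handling an order; the document flows through whole, which is exactly the shift away from freezing and thawing individual properties.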

This is a subtle shift, and it represents a level of programming that is closer in spirit to systems architecture. It is also generally complementary to more traditional programming - the processors themselves still need to be written by traditional programmers, but those processors become increasingly agnostic about the business logic that runs over them. This holds true whether the processor is a web browser, a router, a transformation engine or a data filter.

Notice in all of this that I haven't mentioned XML once. This was deliberate. My sense is that what seems to be emerging on the web now is a paradigm of macro-document formats and micro-document formats. Macro-documents are typically narrative and sequential in structure, have contextual identity (a document can be uniquely identified by some form of "name") and are usually fairly deep. Micro-documents, on the other hand, are usually unordered or semi-ordered bags of properties (each property of which is uniquely named within its siblings), typically do not have a unique identity, and usually employ a combination of hash tables and arrays to store content.
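
The macro/micro distinction can be made concrete with a pair of toy documents (both invented for illustration): the macro-document is narrative, ordered, and has a contextual identity, while the micro-document is an unordered bag of uniquely named properties built from hashes and arrays.

```python
import json
import xml.etree.ElementTree as ET

# Macro-document: narrative, sequential, uniquely nameable.
macro = ET.fromstring("""
<article name="analysis-2009">
  <title>The Web Services Era</title>
  <para>First paragraph.</para>
  <para>Second paragraph: order matters here.</para>
</article>""")

# Micro-document: an anonymous property bag of hashes and arrays.
micro = json.loads('{"id": 42, "tags": ["rest", "xml"], "total": 19.95}')

print(macro.get("name"))     # contextual identity: a unique "name"
print(sorted(micro.keys()))  # property bag: ordering is irrelevant
```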

Whether that combination is XML and JSON or some other set of formats is of only peripheral interest, although it is my suspicion that both are sufficiently well entrenched at this stage that no other formats are likely to get traction for the next decade anyway (I think when the semantic web reaches its own level of maturity and mass adoption that we'll see something else emerge, but this will be at a different level of abstraction).

In 2008, I saw a number of different people arrive at the same realization at the same time: that syndication, which employs the concept of REST at a very profound level, was actually a remarkably good carrier of far more than blog posts. At about the same time, the final version of XQuery began to appear in both purely XML and XML/SQL hybrid databases, along with a long-overdue effort to address the issue of an update standard for XML content (the XQuery Update Facility, or XQUF - okay, the name isn't exactly euphonious).
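
To see syndication as a generic carrier, consider a minimal Atom entry wrapping a non-blog payload - here an invented order document rather than a post. The feed contents are made up for illustration; only the Atom namespace and element names are real.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
feed = ET.fromstring(f"""
<feed xmlns="{ATOM}">
  <entry>
    <id>urn:example:order:42</id>
    <title>Order 42</title>
    <content type="application/xml">
      <order xmlns=""><total>19.95</total></order>
    </content>
  </entry>
</feed>""")

# Consume the feed exactly as a reader would consume blog posts,
# except the payload is arbitrary XML rather than prose.
orders = []
for entry in feed.findall(f"{{{ATOM}}}entry"):
    entry_id = entry.find(f"{{{ATOM}}}id").text
    total = entry.find(f"{{{ATOM}}}content/order/total").text
    orders.append((entry_id, total))
    print(entry_id, total)
```

The envelope machinery (ids, titles, dates, links) comes free with the feed; the `content` element carries whatever document the application cares about.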

What's more, people have begun to combine such XQuery content - whether serving up Plain Old XML (POX), syndicated content via Atom or RSS, or messaged content via SOAP - with an XML-enabled front end client: XForms, obviously, but many other tools are also moving toward an XML model driving a writable user interface. My friend and colleague Dan McCreary coined the term XRX - XQuery/REST/XForms - for this assemblage, and overall that seems to be the meme that's actually sticking in discussion.

The idea here is that this approach - which uses XQuery collections, whether in a database, a file, a syndication stream or some other resource, as vehicles for passing both content and links to content to an XML-enabled editor that can then round-trip those documents back to the server - is intrinsic to the web, and it is already having a huge impact on the way people write applications. Overall, the concept seems to be gaining the name RESTful services, thanks in great part to Leonard Richardson and Sam Ruby's book RESTful Web Services.

In many respects the concept isn't new - after all, much of AJAX employs this model, one in which components use the XMLHttpRequest object to change the state of the components within a web page - but XML RESTful services in particular work at a somewhat higher architectural level, especially since XML (and HTML, the almost XML) provide the substrate on which most JavaScript services operate.

I think that RESTful services will be a major area of focus in the industry, especially toward the latter half of 2009. As a concept, it sees the web as being very much like a database, and this approach is resonating with people in a way that SOAP-based remote procedure calls - with their attempt to turn the web into a sea of COM objects - failed to do outside of very sandboxed environments.
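
The "web as database" idea comes down to the uniform interface: each URI names a resource, and the small fixed set of HTTP verbs maps onto storage operations, rather than each object exposing its own method names RPC-style. A toy sketch, with an in-memory dict standing in for the server:

```python
store = {}

def handle(method, uri, body=None):
    """Map the uniform HTTP verbs onto storage operations."""
    if method == "PUT":
        store[uri] = body            # create or replace the resource
        return 201, body
    if method == "GET":
        return (200, store[uri]) if uri in store else (404, None)
    if method == "DELETE":
        return 204, store.pop(uri, None)
    return 405, None                 # method not allowed

print(handle("PUT", "/orders/42", "<order><total>19.95</total></order>"))
print(handle("GET", "/orders/42"))
print(handle("DELETE", "/orders/42"))
print(handle("GET", "/orders/42"))
```

Note that `handle` knows nothing about orders in particular - any resource under any URI gets the same four operations, which is precisely the database-like quality described above.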

I don't see SOAP going away, however. It is a messaging format, and one designed specifically for high reliability, high security transport across a wide range of protocols, though I sometimes wonder if in the attempt to make it work across protocols the designers of SOAP didn't make a mistake - trying to build the ultimately conformable box when all that was needed in most circumstances was an envelope.

I see Atom emerging as the RESTful counterpart to SOAP, at least over the HTTP protocol.

On the other hand, I think the one to watch carefully over the next year is the Extensible Messaging and Presence Protocol (XMPP). XMPP was designed as the message transport protocol for the Jabber project. It actually comprises a number of different messaging formats as well as a communications protocol, and the idea behind it is to facilitate connected, synchronous communication, rather than the stateless, asynchronous communication that's a hallmark of HTTP over TCP/IP. Because of the way the protocol is set up, XMPP messages can generally be transferred far more efficiently than either SOAP or syndicated Atom. Not surprisingly, organizations that need real-time messaging systems are looking at XMPP very closely, especially since it can, in theory, even be used as a transport over which Atom or SOAP can ride.
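
For a sense of the protocol's units, here is a message stanza built by hand - the addresses are invented, and a real client would use an XMPP library and a persistent XML stream rather than constructing stanzas this way.

```python
import xml.etree.ElementTree as ET

# An XMPP message stanza: a small XML fragment sent over a
# long-lived, stateful connection between client and server.
msg = ET.Element("message", {
    "from": "alice@example.com",
    "to": "bob@example.com",
    "type": "chat",
})
ET.SubElement(msg, "body").text = "The build finished."

stanza = ET.tostring(msg, encoding="unicode")
print(stanza)
```

Because the connection already exists and carries its own state, each stanza needs only this much framing - no per-message connection setup and no SOAP-style envelope, which is where the efficiency advantage comes from.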

I look to see XMPP becoming a critical part of the web developer's toolkit by the end of 2009, as developers become more comfortable with generalized messaging architectures.
