Metaphorical Web and XRX

By Kurt Cagle
September 30, 2008

On October 12, 2003, I made a rather startling realization. I had, for a number of years, been producing one or two books a year in between consulting gigs and writing columns for news-stand publications and websites, and my career as a writer and consultant definitely seemed to be looking up. But I was also reaching a definite burn-out point, due to a lot of different personal factors - my personal life wasn't going well, the economy had definitely cratered in the tech sector, the jobs I was taking on were stultifying and frustrating, and I was angry. Contrary to popular opinion, anger is not in fact all that good for a writer - you write, but what you write usually falls into the kind of political diatribes favored by more radical members of fringe parties.

On that date, it occurred to me that I had not, for more than a year, actually written one thing for technical publication - not a book, not an article, not a web posting, not even helpful suggestions to a bulletin board. I had, as one acquaintance of mine put it at the time, just dropped completely out of sight, to the extent that there were some people who wondered if I had in fact died. I hadn't, of course, but I had been going through one of those existential crises where, if I had continued to feed the anger, I would have ... my stress levels were high enough to send me to the hospital earlier that year with symptoms of a heart attack, and had I not received blood-thinning drugs it's likely that I would have had one that was much worse, potentially fatally worse.

The mind works in mysterious ways - sometimes the subconscious needs to hit you over the head in order to get the conscious part to sit up and take notice of its surroundings. On that day, I sent out my first email newsletter to a group of about 30 people, one focused on technology and business but that also let me say things that I couldn't say in commercial outlets. I strove initially to write a letter a day (I was between assignments at that point), though that slowed after a month to about once every week or so. I used Yahoo initially to host my newsletter, calling it The Metaphorical Web, and eventually I had a couple of thousand people on that list.

Writing the newsletter helped me to jumpstart my career, and by late 2004 I was actually fairly busy. This was something of a mixed blessing - when you're unemployed, writing a letter even once a week is plenty feasible. When you're not, then the same letter can become a fairly onerous burden. I shifted over to blogging around that point, from a suggestion by a friend indicating that it was the wave of the future (and it was, though I've come to realize that there are significant benefits to newsletters as well, which I'll touch on in a bit).

Thanks to M.David Peterson, I also started blogging for O'Reilly a few months after I began the newsletter. While it raised my profile, it also forced me to concentrate less on the personal and more on the technical, which was similarly a mixed blessing. The technical is important - it provides value and lets you help to teach and educate and shape others' thinking, but no writer can be just a machine producing code ... at some point, you need to get those things off your chest that are bothering you in the larger context.

I am currently one of three online editors at O'Reilly, with XML and web technologies as part of my coverage area, though economics, education and sustainability are in there as well. Yet even there, I have to maintain a persona of impartiality, telling the story rather than giving my thoughts, and I've been thinking for a while that I need to move beyond the immediate sphere of O'Reilly and have a place that I can focus my own thoughts on what's going on in the world, as well as establish an imprint that isn't tied to my employer - as important and significant as I feel they are. The result is Metaphorical Web.

Why Metaphorical Web?

The name Metaphorical Web started out as a bit of a pun on "The Semantic Web", but over the years I've really come to realize that the two are fairly distinct concepts. Semantics is of course the study of meaning - how we associate strings of letters together to provide meaningful descriptions of something. It's an odd idea, when you really stop and think about it - for the most part computers have very clear associations between symbols (strings of characters, usually, though that doesn't have to be the case) and actions. A function name, for instance, is a symbol that is associated with a block of code, and when that function is invoked, there's a straightforward replacement with this block of code (potentially compiled down to a binary version of the same thing). The goal of semantics, ultimately, is to establish such relationships between symbolic tokens that inferences about the larger world in which these tokens are defined become more obvious.

The problem, of course, is that human beings are remarkably adept at creating incomplete definitions, shades of meaning, evolving definitions and so on that tend to make a mockery of attempts to create artificial intelligences. At the recent Balisage conference in Montreal, I saw someone who had come up with a view of creating Semantics that looked very much like an I Ching (perhaps an AI Ching?), but even among the semanticists in the room there was a lot of skepticism about it.

I think a part of this is because people are beginning to realize that semantics cannot easily be reduced down to atomic components, but that instead such semantics exists more as some type of quantum wave form where different symbols placed in context create an intrinsic coupling that can't necessarily be predicated upon the characteristics of the components that made this coupling. Indeed, I have begun to wonder whether in fact the one thing that quantum computers may actually be good for (besides solving encryption problems) is providing a physical substrate (and hence a theoretical one) for a reasonable understanding of semantics.

Metaphors, on the other hand, are both easier to understand and more approachable. A metaphor is at its core a mapping between two sets of symbologies. As literary devices, metaphors and similes tend to become confused, but there are differences - a simile is usually short, highly structured, and embedded within a context: "She moved like a cat defending her tail from too many feet." A metaphor, on the other hand, is typically more extended, more loosely structured, and stand-alone: "She was a cat, her tail twitching frantically among the inattentive feet of passersby, moving gracefully but with a wariness that bordered on panic."

In the latter case, you are replacing one set of symbolic referents - a woman walking down a street - with another - a cat guarding its tail. In computer systems, metaphors are also known as models, because what you are attempting to do is create a representation of one type of system using the symbolic notation from a different type of system. As with any map, a metaphor may have a fairly high degree of fidelity, such as a map showing terrain with elevations as a 3D model representing that terrain, but may also have comparatively little obvious correspondence - a map showing only major thoroughfares and schools as a shorthand sketch, for instance.

This process of building models is in fact one of the key roles of a software or system architect, just as it is for any scientist or engineer. In some cases, the model is quasi-functional - you can create basic systems via the model itself - but the functionality of the model is secondary to its use as a way to determine whether the system itself meets the needs of its users.

I also think the idea that you can create a full-blown application of arbitrary complexity from a model is perhaps as unrealistic as trying to model human semantics through mechanical means, perhaps for the same reason: a model by its very nature restricts some subset of the overall system's characteristics in order to better articulate those characteristics that are important, but as you increase the number of characteristics you also increase the complexity of the model (and hence, the potential for "quantum entanglement" and emergent behaviors). I've seen this process happen (and have been guilty of this assumption myself more than once) where there is a confusion between the model and the system that's being modeled, and that confusion can prove deadly for projects.

Yet to me, learning only occurs when we have the necessary referents in our heads to place new information into the context of our existing world view, which is of course itself simply another model of reality, albeit one that we tend to assume is the reality itself. Metaphors are those referents, and as such, the metaphors often become bricks in our sense of reality itself. The file/folder mechanism that has become nearly universal on most computer systems is a metaphor, and ironically it's a metaphor that has increasingly lost its initial referents - in this day and age, children who work with computers are far more likely to think of files and folders on their computer desktops as the base entities, and are likely to have to actively work to equate a computer file with a sheaf of paper stuck in a manila folder.

Thus I see one role of the metaphorical web as the vehicle that challenges the assumptions about what the referents are and how we think about them, and as a consequence, also changing our assumptions about what reality itself is. If our perceptions of reality are in fact made by layering metaphor on top of metaphor, by understanding the nature and characteristics of such metaphor we also gain a much deeper insight into what we consider reality itself.

An Introduction to XRX

So what does this have to do with XQuery and Atom and REST and XSLT and the remaining panoply of three and four letter acronyms that make up this particular space? Quite a bit, actually. XML is a language for creating a model of a given object or system. It's worth noting that, contrary to many people's instinctive beliefs, there is no exclusivity to models - simply because I create a model for describing the traffic flow through a city does not mean that I (or someone else) couldn't also create a model describing crime patterns in a city or where schools are found. Of course, in putting together such models there are undeniable advantages to setting up the models in such a way that you could, theoretically, overlay the crime pattern model and the school locations on the traffic flow patterns.

This is one of those areas where ultimately I believe the relational data model breaks down. One very real problem with SQL in particular is that there is no clean way to separate concerns within a relational data model without resorting to an ad hoc scheme. You can't implicitly build a model in SQL that says that this particular piece of information belongs to the core city model or to the crime overlay on that model - this ends up having to be an assumption made on the part of the modeler, and as a consequence usually ends up with a situation where the model's notation has to enter into each instance. (The one workaround I can see here is to devote one field per table of properties to act as a "namespace" identifier - a property category, which can take values such as city, crime, education, etc.)
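A minimal sketch of that category-column workaround, using Python's built-in sqlite3 module. The table layout, column names and sample values are all invented for illustration; the point is only that the "namespace" has to be carried as data in every row.

```python
import sqlite3

# Hypothetical property table: the "category" column plays the role of a
# namespace, and has to be filled in by hand for every single row.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE property (entity TEXT, category TEXT, name TEXT, value TEXT)"
)
db.executemany(
    "INSERT INTO property VALUES (?, ?, ?, ?)",
    [
        ("harbour", "city", "population", "12000"),
        ("harbour", "crime", "incidents_2008", "42"),
        ("harbour", "education", "schools", "3"),
    ],
)

def overlay(category):
    """Return the (entity, name, value) rows belonging to one model layer."""
    rows = db.execute(
        "SELECT entity, name, value FROM property "
        "WHERE category = ? ORDER BY name",
        (category,),
    )
    return rows.fetchall()

crime_rows = overlay("crime")
```

Nothing in the schema itself knows that "crime" is an overlay on "city"; that knowledge lives only in the modeler's conventions.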

XML does support namespaces. Such namespaces provide a significant separation of concerns, making it possible to create data models that have both core components (the underlying city model in the above case) and extension components. XML also intrinsically builds the directed acyclic graphs (in practice, trees) that capture the relationships between components, primarily the strong one-to-one or one-to-many relationships that many larger or more complex data structures tend to exhibit - such structures (or document object models) can usually be resolved into collections of hierarchical trees. It is also possible (though not quite as efficient) to model many-to-many relationships within XML through the use of either internal or external references - cross references to points within the document or references to external data structures available over some kind of protocol. This protocol can be (and usually is) HTTP, but in point of fact such a reference can use any protocol, and in many cases the referenced content may not even be XML.
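As a rough illustration of that separation, the sketch below uses Python's standard xml.etree.ElementTree. The namespace URIs and the element names (city, district, population, incidents) are made up for this example and do not belong to any real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this illustration.
CITY_NS = "http://example.org/ns/city"
CRIME_NS = "http://example.org/ns/crime"

DOC = f"""
<city:city xmlns:city="{CITY_NS}" xmlns:crime="{CRIME_NS}" name="Exampleville">
  <city:district name="Harbour">
    <city:population>12000</city:population>
    <crime:incidents year="2008">42</crime:incidents>
  </city:district>
  <city:district name="Uplands">
    <city:population>8000</city:population>
    <crime:incidents year="2008">7</crime:incidents>
  </city:district>
</city:city>
""".strip()

def elements_in_namespace(root, ns_uri):
    """Return every element whose tag belongs to the given namespace."""
    prefix = "{" + ns_uri + "}"
    return [el for el in root.iter() if el.tag.startswith(prefix)]

def strip_overlay(root, ns_uri):
    """Remove all elements from one namespace, leaving the rest intact."""
    prefix = "{" + ns_uri + "}"
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.startswith(prefix):
                parent.remove(child)
    return root

root = ET.fromstring(DOC)
crime_overlay = elements_in_namespace(root, CRIME_NS)
core_only = strip_overlay(ET.fromstring(DOC), CRIME_NS)
```

The crime overlay can be pulled out, or dropped entirely, without the core city model needing to know it was ever there.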

The XRX principle in this case is simple enough. Its implicit assumption is that "data" consists of one or more collections of entities which can appear to a data abstraction engine (DAE) as XML. This doesn't mean that the physical data store is XML, only that, from the standpoint of the DAE, what comes out is XML, and any XML that's submitted will be converted via the DAE into the appropriate data format. In essence, the DAE is, well, a metaphor, a mapping from one set of symbols (the internal format used by the data provider) to another (an XML representation of that data). This data abstraction is important, if not critical, because it makes it possible to work with information at the abstraction level of the document, rather than at the representation of the underlying SQL tables or file structures or JSON data feeds. What's more, such a DAE also makes few distinctions (and ideally no distinctions) between information that is maintained locally and information that comes from external data stores, over the wire or through a device.
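To make the DAE idea concrete, here is a toy sketch in Python, assuming a relational store (an in-memory SQLite table) behind an XML-only boundary. The class name CityDAE, its methods, and the city schema are hypothetical; a real engine would of course be far more general.

```python
import sqlite3
import xml.etree.ElementTree as ET

class CityDAE:
    """Toy data abstraction engine: SQL rows in storage, XML at the boundary."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE city (id TEXT PRIMARY KEY, name TEXT, population INTEGER)"
        )

    def put_xml(self, xml_text):
        """Accept an XML entity and store it in the internal relational format."""
        el = ET.fromstring(xml_text)
        self.db.execute(
            "INSERT OR REPLACE INTO city (id, name, population) VALUES (?, ?, ?)",
            (el.get("id"), el.findtext("name"), int(el.findtext("population"))),
        )

    def get_xml(self, city_id):
        """Return the stored row as an XML entity, or None if it is missing."""
        row = self.db.execute(
            "SELECT id, name, population FROM city WHERE id = ?", (city_id,)
        ).fetchone()
        if row is None:
            return None
        el = ET.Element("city", {"id": row[0]})
        ET.SubElement(el, "name").text = row[1]
        ET.SubElement(el, "population").text = str(row[2])
        return ET.tostring(el, encoding="unicode")

dae = CityDAE()
dae.put_xml(
    '<city id="c1"><name>Exampleville</name><population>12345</population></city>'
)
round_tripped = dae.get_xml("c1")
```

The caller never sees the table; it hands in a document and gets a document back.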

The benefit to this approach of data abstraction is that by moving everything into a single data abstraction language (in this case XML), you can perform operations upon this information using an abstraction query and update language, regardless of where or how it was stored. XQuery is that language. XQuery was designed using the same relational calculus that SQL itself was built on, but with the addition of utilizing XPath (specifically XPath 2.0) to retrieve nodes within trees, the introduction of modularization (using namespaced notation) in order to create specific libraries of methods and establish an OOP-like capability, and the ability to extend these functions using the host language for the XQuery engine. This means that with some work you can create an extension that handles SQL calls from XQuery, or performs complex financial calculations, or even resizes graphics as the case may be. Moreover, as with SQL, there are operators (the XQuery Update Facility) that make it possible to save XML resources into the associated data abstraction, providing for the full publishing pathway - you can push XML out through REST interfaces (described shortly) to clients, and you can receive XML from clients for publishing back into the appropriate queue.

REST has become something of a buzzword of late, but at its core is a powerful idea - that one of the more effective ways of working with data content on the web is to treat all operations as, ultimately, publishing operations - getting an object from a collection, putting objects back into a collection on the server, posting objects to a queue (or collection) of related content, deleting an object from a collection, and getting publishing metadata about that object. This approach forces a change in the way that we think about applications, most of which are far more oriented towards services than objects because that approach is more familiar to procedural developers, but the RESTful approach typically more closely approximates the way that we work with database operations - and as such makes the atomicity of transactions much easier to maintain because at any point in the workflow, the whole state of the object is within the immediate process of operation.
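The five publishing operations can be sketched as methods on an in-memory collection. This is only an illustration in Python: the Collection class is hypothetical, the method names simply mirror HTTP verbs, and mapping "publishing metadata" onto a HEAD-like call is my own assumption.

```python
from datetime import datetime, timezone

class Collection:
    """In-memory stand-in for a server-side collection of XML documents."""

    def __init__(self):
        self._docs = {}   # resource id -> XML text
        self._meta = {}   # resource id -> publishing metadata
        self._next_id = 1

    def _stamp(self, rid):
        """Record when the resource was last published."""
        self._meta[rid] = {
            "id": rid,
            "updated": datetime.now(timezone.utc).isoformat(),
        }

    def get(self, rid):
        """GET: fetch the whole object from the collection."""
        return self._docs[rid]

    def put(self, rid, xml_text):
        """PUT: replace the whole object at a known location."""
        self._docs[rid] = xml_text
        self._stamp(rid)

    def post(self, xml_text):
        """POST: append a new object; the collection assigns the id."""
        rid = f"item-{self._next_id}"
        self._next_id += 1
        self.put(rid, xml_text)
        return rid

    def delete(self, rid):
        """DELETE: remove the object and its metadata."""
        del self._docs[rid]
        del self._meta[rid]

    def head(self, rid):
        """HEAD-like call: publishing metadata only, not the object itself."""
        return dict(self._meta[rid])

cities = Collection()
rid = cities.post("<city><name>Exampleville</name></city>")
cities.put(rid, "<city><name>Exampleville</name><population>12345</population></city>")
```

Every operation moves a whole document in or out; there is no method for changing one property in place.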

Put another way, imagine that a given object (such as a base description of a city in some model) is contained within some kind of object "wrapper". In many client-server applications, the state of a prototype city object may start out on the server, then certain aspects and properties of that city are sent to the client. The client may change those properties, and while those properties are being changed there are in essence two partial models of the city - the older one that exists on the server, and the partial new one that exists on the client. If changes are made on the client, the server middle-tier layer becomes responsible for folding those specific changes back into the city model. If the application is designed well, at any given point, the server state should be internally consistent, but if it's not designed well, it's possible for the server city model to enter into an incomplete or even a corrupt state, in which information exists in the partial model that is incompatible with the underlying data model.

Moreover, the middle-tier in this particular architecture has to be very specialized and tailored in order to handle each of these state transitions uniquely. For a complex application, the number of potential state transitions grows geometrically, and if the application becomes too large the potential edge cases become catastrophically large and interdependent. Indeed, it's my contention that one of the reasons why large projects in particular tend to fail so much is because people significantly underestimate the number of potential state transitions and as a consequence underestimate the amount of work necessary to either handle them or build exceptions for them.

In a RESTful application, on the other hand, at any given point in the workflow the object in question is always within one process, and is always "atomic" - you can't break the object down into independent subcomponents. One upshot of this atomicity is that you don't need to maintain middle tier state transition managers. The object is created on the server as a prototype and sent to the client as an XML entity. Once on the client, you can change properties while maintaining modeling constraints, and if the model remains internally consistent (and it should, if the application technology is doing its job) then the resulting entity can be sent back to the server to pass through a validation test. If it validates, the object gets added back into the appropriate collection; if it doesn't, then a message to that effect gets sent back to the client, potentially either through a message added to the envelope containing the object or via some back-channel mechanism (more on that in a moment).
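That round trip might look something like the following Python sketch. The validation rules and the function names (validate_city, submit) are assumptions made up for the example; the part that matters is that the server either accepts the whole entity or rejects the whole entity.

```python
import xml.etree.ElementTree as ET

def validate_city(el):
    """Return a list of problems; an empty list means the entity is valid."""
    problems = []
    if el.tag != "city":
        problems.append("root element must be <city>")
    if not (el.findtext("name") or "").strip():
        problems.append("name is required")
    pop = el.findtext("population") or ""
    if not pop.isdigit():
        problems.append("population must be a non-negative integer")
    return problems

def submit(collection, rid, xml_text):
    """Accept the whole entity or reject it; never apply a partial change."""
    problems = validate_city(ET.fromstring(xml_text))
    if problems:
        return {"accepted": False, "problems": problems}
    collection[rid] = xml_text
    return {"accepted": True, "problems": []}

# Server-side collection holding one prototype city.
server = {"c1": "<city><name>Exampleville</name><population>100</population></city>"}

# Client side: parse the entity, edit it as a whole, serialize it again.
doc = ET.fromstring(server["c1"])
doc.find("population").text = "150"
good = submit(server, "c1", ET.tostring(doc, encoding="unicode"))

# An internally inconsistent edit is rejected and the server copy is untouched.
doc.find("population").text = "lots"
bad = submit(server, "c1", ET.tostring(doc, encoding="unicode"))
```

There is no middle-tier code here that merges individual property changes; the only state transition the server handles is "replace the document".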

Note that nothing here specifies that the primary communication channel is HTTP (or even that the object itself is in XML, for that matter). RESTful systems really first evolved in HTTP-based architectures, but it's entirely possible that the primary channel is XMPP (used for instant messaging as part of the Jabber chat application, among many others) or SMTP (for mail transfer), SOAP in a non-RPC medium or any of a number of other potential protocols. What is important here, instead, is using a resource-oriented approach to application development, in general keeping the semantics of the resource distinct from the semantics of the publishing mechanisms.

One of the protocols that I in particular have been watching closely is the use of Atom messages as a mechanism for performing RESTful operations. Atom is used in syndication, providing both publishing metadata (date-time published, author, source, abstract) and categorization information for either feeds or entries within feeds (which in turn can be thought of as a collection and the associated documents within that collection). The Atom Publishing Protocol (also known as AtomPub) is a very REST-centric approach that works well in a variety of circumstances. What's more, other transport protocols such as XMPP can be used to carry Atom messages within them.
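As a sketch of what such an Atom "envelope" looks like, the Python below wraps an XML payload in an Atom entry. The Atom namespace URI is the real one; the helper name atom_entry, the urn:example identifier, the author and the city payload are placeholders for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("atom", ATOM_NS)

def atom_entry(entry_id, title, updated, author, payload):
    """Wrap an XML payload element in an Atom entry carrying publishing metadata."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    author_el = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
    ET.SubElement(author_el, f"{{{ATOM_NS}}}name").text = author
    # The payload travels inside <content> as inline XML.
    content = ET.SubElement(
        entry, f"{{{ATOM_NS}}}content", {"type": "application/xml"}
    )
    content.append(payload)
    return entry

city = ET.fromstring("<city><name>Exampleville</name></city>")
entry = atom_entry(
    "urn:example:city:1",        # placeholder identifier
    "Exampleville",
    "2008-09-30T00:00:00Z",
    "Example Author",
    city,
)
entry_xml = ET.tostring(entry, encoding="unicode")
```

The publishing metadata lives in the Atom elements, while the resource itself stays untouched inside the content element, which keeps the two sets of semantics separate.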

However, for all of that, the dominant REST-based system is HTTP, and thus a REST-based XQuery enabled system (such as the eXist XML database, MarkLogic's XML Server, or the IBM Syndication Server) also provides a powerful tool for working with such atomic XML "state bundles". Most XRX systems are built with server response/request objects in mind.

The final piece of XRX is one that I've had to think about for some time. Originally, that X stood for XForms, and I still believe that XForms represents one of the best mechanisms for dealing with state changes on the client in order to maintain consistency of the model, but of late I've also begun realizing that it's probably more accurate to speak of the second X as "XML-Enabled Client" and leave it at that. What does such a mouthful mean? At its simplest, it indicates that the client is capable of working with XML as a distinct entity without having to decompose it into properties in a way that makes it impossible to reconstruct a cohesive XML structure (again, assuming XML is the data carrier here).

The reason I went with this definition was because I realized that there is in fact no specific requirement that the client should in fact have to be a GUI. The GUI exists to provide bindings between controls and the associated XML data model, but as it stood, any client, whether human directed or autonomous, that could consume an XML object of a given "type" and return an XML object of the same type but with different (internally consistent) properties could in fact work just as effectively here. Indeed, it's entirely possible that such a client may in fact be itself an XRX server to another client, such that changes in state that propagated across multiple types of objects may in fact have an effect upon the secondary client. In this respect, a mixed XRX client/server entity can be thought of as a node in a chain (or a network). There's something very important in this, but I'll leave that for a future article to discuss in depth the importance of an XRX network.

There's a subtle point that I tossed in here that some of you may have picked up on. I've been talking about back channels and changing the state of multiple objects through a single post and similar issues that suggest that the publishing model is not completely pure. It's not. Most models are not monolithic, consisting of a single type or collection of objects. Instead, the models typically consist of multiple entities that have some relationship to one another that can't be captured in a pure publishing model. Notification messages, for instance, represent one such break from the pure architecture. Let's say I post a city that is internally consistent on the client, but that for some reason is inconsistent with information on the server - perhaps the number of people in the city is larger than the model is designed to handle, and the city is rejected. You could potentially send the rejected information back to the client, but then you have an inconsistent state within a given process. The alternative, and the preferred approach, is that when the post occurs, a new message object is created and placed in an error message collection. It then becomes the responsibility of the client to periodically query that collection in order to retrieve this message, and to display it in a "back channel".
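A bare-bones Python sketch of that back channel, under the assumption that the server keeps a simple list as its error message collection. The population limit, the function names and the message fields are all invented for the example.

```python
MAX_POPULATION = 1_000_000  # hypothetical server-side limit

cities = {}          # accepted city entities, keyed by id
error_messages = []  # the "error message collection" that clients poll

def post_city(client_id, city_id, population):
    """Accept the city, or reject it and record a message for the client."""
    if population > MAX_POPULATION:
        error_messages.append({
            "client": client_id,
            "resource": city_id,
            "text": f"population {population} exceeds limit {MAX_POPULATION}",
        })
        return False
    cities[city_id] = {"population": population}
    return True

def poll_messages(client_id):
    """Return and clear the messages waiting for one client (the back channel)."""
    mine = [m for m in error_messages if m["client"] == client_id]
    error_messages[:] = [m for m in error_messages if m["client"] != client_id]
    return mine

ok = post_city("alice", "c1", 5_000_000)
msgs = poll_messages("alice")
```

The rejected city never enters the cities collection; the only new server state is the message, which is itself just another published object.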

In a SQL context, this would be called a trigger - an event occurs during a REST operation, and the trigger then changes the internal state of the server model to more properly reflect the new state by performing a specific action. I believe that in an XRX model you must have some way of registering triggers on specific publishing events, and that those triggers are themselves XQuery commands. Needless to say, message queues provide one obvious use of such triggers, but others could be triggers tied into deleting entities that may also delete linked content (or break back-links to existing content), triggers that invoke external applications such as sending out email messages when a new item is posted, or triggers that send out a server ping to indicate that a message has been added to a message queue.
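One way to picture trigger registration is the small Python sketch below. In an actual XRX system the triggers would be XQuery expressions run by the server; here plain Python callables stand in for them, and the TriggerRegistry class, the store layout and the handler names are hypothetical.

```python
from collections import defaultdict

class TriggerRegistry:
    """Maps (event, collection) pairs to handlers run after a publishing operation."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event, collection, handler):
        self._handlers[(event, collection)].append(handler)

    def fire(self, event, collection, resource_id):
        for handler in self._handlers[(event, collection)]:
            handler(resource_id)

triggers = TriggerRegistry()
store = {
    "cities": {},
    "districts": {"d1": "<district/>", "d2": "<district/>"},
    "links": {"c1": ["d1", "d2"]},  # city id -> ids of linked districts
}
outbox = []  # stands in for email messages or server pings

def post(collection, rid, doc):
    """Publish a document, then run any triggers registered for the event."""
    store[collection][rid] = doc
    triggers.fire("post", collection, rid)

def delete(collection, rid):
    """Remove a document, then run any triggers registered for the event."""
    del store[collection][rid]
    triggers.fire("delete", collection, rid)

def notify(rid):
    """Trigger: announce a newly posted city."""
    outbox.append(f"new city posted: {rid}")

def cascade(rid):
    """Trigger: deleting a city also deletes the districts linked to it."""
    for linked in store["links"].pop(rid, []):
        store["districts"].pop(linked, None)

triggers.register("post", "cities", notify)
triggers.register("delete", "cities", cascade)

post("cities", "c1", "<city/>")
delete("cities", "c1")
```

The client still only performs plain publishing operations; the side effects on other collections are the server's responsibility.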

This points out one of the central tenets of distributed programming - the goal is to minimize coupling and interdependencies, recognizing in the process that such interdependencies are nonetheless sometimes both unavoidable and necessary. The interdependencies on the client of an XRX system are set by constraints within the data model on the specific object being modified, but on the server, the interdependencies occur between objects within the system, and the constraints are maintained by triggers and actions.

This also leads to one final observation about XRX systems - they represent surprisingly pure instances of patterns. For instance, most XRX systems work by heavily employing a subscribe/publish model, where people subscribe to a given service, and either through polling or some other technique (such as an action pinging a listening server) when a change occurs the system downloads the changed entities (or their proxies). Note that not all operations actually require that the object itself be sent back - many instead simply need an abstract of the changed objects (such as might be contained in an Atom feed), or even a representation of a control list (such as an AJAX based select control that lists the relevant items, or a table showing the relevant information with a check to indicate selection). Since the role of such output is to provide context for selection in order to perform some other operation, this is not inconsistent with the underlying XRX model.

I'm shying away from specific architecture in this particular post, but am beginning to post information about my own XRX based application, x2o, both on the Metaphorical Web site, and, at least for a while, on my x2o Google Code pages. Note that a secondary goal of x2o will be to provide the foundations for a Drupal-like application built using an XRX framework (likely some combination of x2o and related code from participants like Dan McCreary) which for now we're calling xDrupal.

Connecting XRX and the Metaphorical Web

While the underlying theory in creating XRX applications is sound, it's taken me a while to put all the pieces together (indeed, the idea of registering triggers and actions, which seems obvious in retrospect, eluded me until just recently, making it much harder to effectively separate the complexity of the pieces involved). Those of you who have followed my talks over the last couple of years have likely watched this process, and have probably wondered when I'd reach a point where I could put my money where my mouth was. I think that time is now approaching fast, but I also would like to point out that my own application is simply one of a number of new XRX applications that are beginning to emerge literally all over the world.

XRX is a philosophy of design, one that I think actually works very well on the web. It differs from a more traditional services-oriented approach in that the idea here is not to treat a server as a stateful OOP object that you can call methods on to perform certain stateful transformations, but rather is more like a database that you can query or update, but because of interdependencies that are not always manifest to the user making these invocations, may alter the state of some (or even all) of the entities within that database context. It's a novel approach, but I believe that as the complexity of the web increases, it will go from being novel to being necessary in very short order.

I had previously mentioned newsletters. I will be publishing a regular (likely monthly) newsletter that people can subscribe to by creating a (free) account on my Metaphorical Web site. I also hope to restart forums for XForms, XQuery and XRX developers, either through this site, through O'Reilly's forum systems, or both. If you are interested in getting involved with the x2o/xDrupal project, please drop me a line.

Kurt Cagle is Online Editor for O'Reilly Media, and lives in Victoria, British Columbia. You can subscribe to his news feed here.
