Tactical and strategic XML design

By Rick Jelliffe
November 5, 2009

A design issue came up in an XML system I have been looking at recently. I can explain it in terms of tactics (ways to win an individual battle) versus strategy (ways to win the whole war).

The issue is that we so often (always?) build systems where the architecture (organizing principles, top-level design, platform decisions, etc.) is harder to change once deployed than the data structures, algorithms and user interfaces. Whether or not we use a top-down methodology, we end up with a top-down system.

However, it seems that the more the architecture is based on strategic lines, the greater the risk that the initial system will fail its goals, because the architecture is addressing goals other than the immediate ones. Conversely, the more the architecture is based on tactical lines, the more likely it is that design decisions that were appropriate for the first iteration will be made obsolete by changing events in subsequent versions.

A good tactic in one battle may be a liability in the next

For example, in the system I have been looking at, diverse XML inputs are first merged into a common data format, then multiple publications are extracted from this. It is a kind of n:1:m system; pattern people may see the common data format as a facade. The reason for adopting this flow was that originally the input formats were in flux (additions, renamings, deletions), as were the number of outputs and their details. The developers quite rightly said that the only way they could deliver something was to freeze the data format, which they did, and the system was delivered on time, on budget and working.
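To make the n:1:m shape concrete, here is a minimal sketch in Python using only the standard library. It is not the system described above: the feed formats, the `record` common format and every function name are invented for illustration. The point it shows is that each input needs one adapter into the common format and each output needs one renderer from it, so n inputs and m outputs need n + m converters rather than n × m.

```python
import xml.etree.ElementTree as ET

# Hypothetical common format: <record><title>...</title></record>
def common_record(title):
    rec = ET.Element("record")
    ET.SubElement(rec, "title").text = title
    return rec

# n input adapters: each knows one (invented) input format only.
def from_feed_a(xml_text):
    # Feed A keeps the title in a <t> child element.
    return common_record(ET.fromstring(xml_text).findtext("t", default=""))

def from_feed_b(xml_text):
    # Feed B keeps the title in an attribute.
    return common_record(ET.fromstring(xml_text).get("title", ""))

# m output renderers: each knows the common format only.
def to_plain(rec):
    return rec.findtext("title", default="")

def to_heading(rec):
    return "# " + rec.findtext("title", default="")

rec_a = from_feed_a("<a><t>Hello</t></a>")
rec_b = from_feed_b('<b title="World"/>')
print(to_plain(rec_a))
print(to_heading(rec_b))
```

The frozen common format is what lets the two halves be built independently. It is also the thing every later change has to pass through.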

But the architecture of this system is based on a tactic, not a strategy: the tactic is of reducing the n:m problem to an n:1:m problem (where the n is not only that there were different inputs, but that they were reused over time). In fact, this same issue arose in several places along the same pipeline, so the system is organized as a series of scatter/gather stages with multiple intermediate "common" formats.

But now the inputs have stabilized: really it is just a 1:m system. And when a change is required on an output, the developer has to trawl back to each stage in the pipeline and potentially make changes at each stage. So the tactic of adopting the multiple facades has created a maintenance problem (the client needs agile changes to the outputs) and the system will be refactored.

The solution, I think, is that the strategy has to be to allow for refactorable architectures as much as possible: Lego blocks. The classic solution for this is components: small self-contained bundles of functionality that can be interfaced conveniently but which do not require internal code changes for use in different positions. I would see web services as a kind of component system (small w, small s, not necessarily the whole WS-* malarkey). A DBMS might be a component too.

So the question becomes, in part, how do we build XML pipelines or XML systems which don't require internal code changes for use in different positions? That is much more tricky, but I think one answer is that an XML component should only validate, inspect, change or verify the particular elements it is functionally concerned with.
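As a rough illustration of that idea, the following Python sketch (standard library only) shows a component whose only functional concern is `price` elements. The element names and the two sample documents are made up for the example. The component walks whatever document it is given, checks the prices, and ignores everything else, so it can sit at different points in a pipeline without internal changes.

```python
import xml.etree.ElementTree as ET

def check_prices(root):
    """Return a list of problems found in <price> elements only.

    Everything else in the document is deliberately left unexamined.
    """
    problems = []
    for price in root.iter("price"):  # any depth, any surrounding structure
        text = (price.text or "").strip()
        try:
            value = float(text)
        except ValueError:
            problems.append(f"price is not a number: {text!r}")
            continue
        if value < 0:
            problems.append(f"price is negative: {text!r}")
    return problems

# The same component used at two hypothetical pipeline positions,
# where the surrounding markup differs completely.
early = ET.fromstring(
    "<feed><item><sku>A1</sku><price>9.50</price></item></feed>"
)
late = ET.fromstring(
    "<catalogue><section><unknown/><product><price>-2</price></product>"
    "</section></catalogue>"
)
print(check_prices(early))
print(check_prices(late))
```

Note that the unexpected `unknown` element in the second document is not reported: it is outside this component's concerns, which is exactly the behaviour a whole-document grammar check would not give you.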

(Regular readers will see what is coming next... I cannot help myself: unless the XML component is actively interested in all parts of one namespace, XSD as practiced is unsuitable, because it will check things unrelated to the functional concerns of the XML component. I think this is one reason why XSD is so rarely used internally in systems, and why, where it does appear, it often just provides documentation or is there because someone likes a tool that uses XSD.)

Identify tactical and strategic use of XML

So I guess when we look at a system's architecture, the first thing we can do is ask "Is this XML here being used strategically or tactically?" A strategic use might be, for example, to allow long-term archiving; a tactical use might be XML in AJAX (where using JSON would be another tactic). If the answer is tactical, then we can ask "Is it implemented in a way that allows flexible rearrangement, when a different tactic becomes appropriate?"

For example, consider an XSLT2 transformation that converts some data into an output format using pull programming (where there is a fairly fixed output structure, as distinct from push programming, where the information present in the XML determines the structure of the output). The script will typically use xsl:for-each instructions rather than xsl:template. Componentizing this for flexibility might involve making up a named function for each of the selection and sorting XPaths of those xsl:for-each instructions, and moving the function definitions to a header or an included file. This parameterizes the stylesheet by the access functions, and may make it easier to plug in a different data feed: the XPaths to be changed are in one place, collected and named.
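The same shape can be sketched outside XSLT. The Python below (standard library only) is an analogy, not a translation of any real stylesheet: the feeds, element names and function names are all invented. The pull-style `render` function fixes the output structure, while the selection, sorting and labelling paths live in small named access functions that can be swapped when the data feed changes.

```python
import xml.etree.ElementTree as ET

# Access functions: the only code that knows the shape of the input.
# In XSLT 2.0 terms these would be the named functions in an included file.
def select_items(root):
    return root.findall("./items/item")

def item_sort_key(item):
    return item.findtext("name", default="")

def item_label(item):
    return item.findtext("name", default="")

def render(root, select=select_items, sort_key=item_sort_key, label=item_label):
    """Pull-style output: a fixed structure filled via the access functions."""
    lines = ["ITEMS"]
    for item in sorted(select(root), key=sort_key):  # plays the for-each + sort role
        lines.append("- " + label(item))
    return "\n".join(lines)

feed_a = ET.fromstring(
    "<doc><items><item><name>pear</name></item>"
    "<item><name>apple</name></item></items></doc>"
)
print(render(feed_a))

# A different (hypothetical) feed: only the access functions change.
feed_b = ET.fromstring('<stock><line title="plum"/><line title="fig"/></stock>')
print(render(
    feed_b,
    select=lambda r: r.findall("./line"),
    sort_key=lambda e: e.get("title", ""),
    label=lambda e: e.get("title", ""),
))
```

The body of `render` never mentions an element name, so plugging in the second feed touches nothing but the three functions passed in.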

Other angles appreciated...
