Document Design Matters

By Erik Wilde
October 2, 2008 | Comments: 2

XML Fever was our article in July's Communications of the ACM (CACM) on how people attribute almost magical powers to XML. We focused on XML technologies, how people use them, how people get attached to them, and how this use and attachment can lead to unfortunate side effects. Taking this thinking one step further, the October issue of CACM contains our follow-up piece, Document Design Matters. Here is the article's abstract:

The classical approach to the data aspect of system design distinguishes conceptual, logical, and physical models. Models of each type or level are governed by metamodels that specify the kinds of concepts and constraints that can be used by each model; in most cases metamodels are accompanied by languages for describing models. For example, in database design, conceptual models usually conform to the Entity-Relationship (ER) metamodel (or some extension of it), the logical model maps ER models to relational tables and introduces normalization, and the physical model handles implementation issues such as possible denormalizations in the context of a particular database schema language. In this modeling methodology, there is a single hierarchy of models that rests on the assumption that one data model spans all modeling levels and applies to all the applications in some domain. The one true model approach assumes homogeneity, but this does not work very well for the Web. The Web as a constantly growing ecosystem of heterogeneous data and services has challenged a number of practices and theories about the design of IT landscapes. Instead of being governed by one true model used by everyone, the underlying assumption of top-down design, Web data and services evolve in an uncoordinated fashion. As a result, a fundamental challenge with Web data and services is matching and mapping local and often partial models that not only are different models of the same application domain, but also differ, implicitly or explicitly, in their associated metamodels.

The article mostly revolves around the fact that XML is only rarely used as a native data model. Instead, in most cases it is an exchange model that is created for some other model, simply because somebody thinks that there will be some value in having an XML-based exchange syntax. Problems arise as soon as there is a mismatch between the metamodel of the application's data model and the metamodel of the XML exchange model. The classical problem is how to represent non-tree graphs in XML's tree-oriented markup.
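To make the graph problem concrete, here is a minimal sketch in Python. It is not taken from the article. The element and attribute names (graph, node, edge, id, source, target) and the sample data are assumptions chosen purely for illustration. The graph contains a node with two incoming edges, so it cannot be written as simple element nesting without duplicating that node. One common workaround is to flatten the graph into a list of nodes and a list of edges that refer to nodes by identifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical application data: "c" is reachable from both "a" and "b",
# so the structure is a graph, not a tree.
edges = [("a", "c"), ("b", "c"), ("a", "b")]
nodes = sorted({name for edge in edges for name in edge})


def to_xml(nodes, edges):
    """Serialize the graph as a flat node list plus edges that point at node ids."""
    root = ET.Element("graph")
    for name in nodes:
        ET.SubElement(root, "node", id=name)
    for source, target in edges:
        ET.SubElement(root, "edge", source=source, target=target)
    return root


def from_xml(root):
    """Rebuild the edge list, checking every reference against the declared nodes."""
    ids = {node.get("id") for node in root.findall("node")}
    result = []
    for edge in root.findall("edge"):
        source, target = edge.get("source"), edge.get("target")
        if source not in ids or target not in ids:
            raise ValueError(f"dangling reference in edge {source!r} -> {target!r}")
        result.append((source, target))
    return result


document = to_xml(nodes, edges)
serialized = ET.tostring(document, encoding="unicode")
round_tripped = from_xml(ET.fromstring(serialized))
```

Note where the graph structure ends up in this sketch: in attribute values that only the application knows how to interpret. The XML tree itself is just a flat list of children, and the check for dangling references has to be written by hand in from_xml. That is the kind of design decision the article is about: the tree metamodel does not carry the graph, so the document designer has to choose and document a convention for it.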

We do not claim to have the silver bullet for coming up with the perfect XML-based exchange model for any given application data model. The main purpose of the article is to point to this area of XML technologies where there is no appropriate technology support so far (there is no XML-oriented conceptual modeling language), and to make sure that document designers are aware of the difficult and important decisions they have to make when creating exchange formats. Here are your options for reading the article:

The official citation for this article is Erik Wilde and Robert J. Glushko. Document Design Matters. Communications of the ACM, 51(10):43-49, October 2008.


2 Comments

How does GRDDL fit (or not fit) into this picture?

wrt GRDDL: Our article talks about the "one true model" world view. GRDDL is more in the realm of the "one true metamodel" world view. It allows users to specify a transformation (in most cases this will be XSLT) for transforming XML into RDF. It is a mechanized way of sprinkling Semantic Web fairy dust over anything XML, assuming that there is a well-defined mapping from the underlying XML data model to RDF.
