OASIS CAM versus ISO Schematron

A new technique for Schematron: exemplars

By Rick Jelliffe
August 21, 2009 | Comments: 1

Every few years I follow up on the progress of OASIS CAM: Content Assembly Mechanism. From the horse's mouth:

The Content Assembly Mechanism (CAM) provides an open XML based system for using business rules to define, validate and compose specific business documents from generalized schema elements and structures.

A CAM rule set and document assembly template defines the specific business context, content requirement, and transactional function of a document. A CAM template must be capable of consistently reproducing documents that can successfully carry out the specific transactional function that they were designed for. CAM also provides the foundation for creating industry libraries and dictionaries of schema elements and business document structures to support business process needs.

The core role of the OASIS CAM specifications is therefore to provide a generic standalone content assembly mechanism that extends beyond the basic structural definition features in XML and schema to provide a comprehensive system with which to define dynamic e-business interoperability.

What is CAM?

CAM is a bundling of

  • an example based mechanism for basic structures, a variant on Eric van der Vlist's Examplotron (CAM commentators call it the WYSIWYG approach),
  • an XPath based set of assertions, a variant on the use of XPath in Schematron, and
  • a kind of selection/number-formatting mechanism for assembling documents, simpler and less powerful than XSLT (ISO DSDL vaguely has DSRL and NVDL in this space too, but they are more different than alike.)

I quite like CAM, and I don't really see it as competitive to Schematron, for a variety of reasons, though it certainly overlaps. There will be cases where Schematron is better, and there will be cases where CAM is better, I should expect or hope!

Of course, my normal initial reaction when reading the CAM spec is to huff and puff about borrowing ideas badly. No patterns! No phases! No diagnostics! No roles! No assertion text! And so on. But it is to compare apes and orangutans: the things I consider important are not the things CAM was developed for.

I think if there is a technology that OASIS CAM provides a competitor for, it is W3C XSD 1.1. And it certainly provides a good challenge for ISO DSDL. Like XSD, CAM gives a simple transformation of the input document and then tries to validate that declaratively, but it allows much stronger transformations than the mere augmentations that XSD provides.

Schematron as neutral?

Readers familiar with this blog will of course anticipate what is next. I will look at some material someone has written comparing Schematron and CAM, and point out that the conclusion is wrong, is not necessarily so, or relates to an old version of Schematron. I don't want to disappoint them.

There is a recent article by Michael Sorens, "Taking XML Validation to the Next Level: Introducing CAM", which quotes a comparison David Wheeler made a few years ago, "Comparing Schematron to CAM", which in turn uses a Schematron example I wrote around 2001.

Sorens says

Both XML Schema and Schematron fundamentally intertwine semantics and structure. In programming terms, the coupling is high, which is not desirable.

My response is: get your stinking categories out of my stinking schema language!

Or, if you like it put another way, CAM's categories of structure and business rules are arbitrary. Which is not to say they are not useful. In Schematron you would model them as phases if you wanted the distinction to be used for validating them differently, or just the pattern/@role attribute otherwise. Schematron does not have an element called structure and an element called businessRule, because it tries to be neutral in regard to analytical categories. The validation constraints of CAM could be implemented in Schematron while retaining CAM's category split by saying "Make a Schematron schema with a phase Structure that only is concerned with structures, and a phase BusinessRules that is concerned with everything else."

(As an example of this arbitrariness, CAM treats containment as a structure issue, and cardinality (optionality) as a business rule issue. But XSD treats cardinality as a structure issue. Schematron avoids the whole thing: the schema developer can have whatever categories they wish.)
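
The phase-based split described above might be sketched roughly as follows. This is only an illustration: the pattern ids (po-structure, po-business-rules) and the two sample assertions are hypothetical, with element names borrowed from the purchase-order example used later in this post, and the choice of which check lands in which phase simply follows CAM's own division.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <sch:title>Category split expressed as phases (illustrative sketch)</sch:title>

   <!-- Each phase activates only the patterns belonging to one category -->
   <sch:phase id="Structure">
      <sch:active pattern="po-structure"/>
   </sch:phase>
   <sch:phase id="BusinessRules">
      <sch:active pattern="po-business-rules"/>
   </sch:phase>

   <!-- Hypothetical containment check, filed under "structure" -->
   <sch:pattern id="po-structure">
      <sch:rule context="shipTo">
         <sch:assert test="parent::PurchaseOrder">A shipTo element should appear directly inside a PurchaseOrder.</sch:assert>
      </sch:rule>
   </sch:pattern>

   <!-- Hypothetical value check, filed under "business rules" -->
   <sch:pattern id="po-business-rules">
      <sch:rule context="shipTo/state">
         <sch:assert test="string-length(.) = 2">A state code should be two characters long.</sch:assert>
      </sch:rule>
   </sch:pattern>
</sch:schema>
```

Validating with the phase Structure would then run only the first pattern, and validating with BusinessRules only the second. Nothing in Schematron cares what the two phases are called, so a different project could slice the same patterns along entirely different lines.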

The sincerest form of flattery

CAM's extension of XPath used for business rule constraints has some high-value ideas. CAM defines a variety of built-in functions, and these take XPaths as their argument.

Can Schematron steal them?

Here is what CAM can do:

<as:constraint action="makeRepeatable(//Items/Item)"/>
<as:constraint action="makeOptional(//Item/comment)"/>
<as:constraint action="setLength(//shipTo/state,2)"/>
<as:constraint action="setDateMask(//PurchaseOrder/shipDate,YYYY-MM-DD)"/>
<as:constraint action="makeOptional(//PurchaseOrder/comment)"/>
<as:constraint action="restrictValues(//shipTo/@type,'US'| 'CA'| 'MX', 'US')"/>
<as:constraint action="setDateMask(//PurchaseOrder/@orderDate,YYYY-MM-DD)"/>
<as:constraint action="setNumberMask(//Item/@pno,###-###)"/>
<as:constraint action="setNumberMask(//Item/quantity,###)"/>
<as:constraint action="setNumberMask(//Item/price,####.##)"/>
<as:constraint condition="//Item/@pno = 123-678"
Can only ship item 123-678 to Washington State

<as:constraint condition="$QuickBooks = true" action="excludeElement(//Item/comment)" />

And here is how you would do it in ISO Schematron. (This is an indicative fragment; I may have made some minor syntax errors.)

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"   ...>
   <sch:title>Purchase Order schema for CAM Comparison</sch:title>
   <sch:param name="QuickBooks" />
   <!-- Abstract pattern definitions go here, to simulate CAM  -->
   <sch:pattern abstract="true" id="conditional-exclusion">
      <sch:rule context="$path">
         <sch:assert test="not( $condition )">The path has been excluded</sch:assert>
      </sch:rule>
   </sch:pattern>

   <!-- One pattern instance per CAM constraint -->
   <sch:pattern is-a="repeatable-element">
      <sch:param name="path" value="Items/Item" />
   </sch:pattern>
   <sch:pattern is-a="optional-element">
      <sch:param name="path" value="Item/comment" />
   </sch:pattern>
   <sch:pattern is-a="string">
      <sch:param name="path" value="shipTo/state" />
      <sch:param name="length" value="2" />
   </sch:pattern>
   <sch:pattern is-a="date">
      <sch:param name="path" value="PurchaseOrder/shipDate" />
      <sch:param name="format" value="YYYY-MM-DD" />
   </sch:pattern>
   <sch:pattern is-a="optional-element">
      <sch:param name="path" value="PurchaseOrder/comment" />
   </sch:pattern>
   <sch:pattern is-a="list">
      <sch:param name="path" value="shipTo/@type" />
      <sch:param name="values" value="'US'| 'CA'| 'MX'" />
   </sch:pattern>
   <sch:pattern is-a="date">
      <sch:param name="path" value="PurchaseOrder/@orderDate" />
      <sch:param name="format" value="YYYY-MM-DD" />
   </sch:pattern>
   <sch:pattern is-a="number">
      <sch:param name="path" value="Item/@pno" />
      <sch:param name="mask" value="###-###" />
   </sch:pattern>
   <sch:pattern is-a="number">
      <sch:param name="path" value="Item/quantity" />
      <sch:param name="mask" value="###" />
   </sch:pattern>
   <sch:pattern is-a="number">
      <sch:param name="path" value="Item/price" />
      <sch:param name="mask" value="####.##" />
   </sch:pattern>
   <!-- The part number is quoted so XPath compares strings instead of subtracting -->
   <sch:pattern is-a="conditional-list">
      <sch:param name="path" value="shipTo/state" />
      <sch:param name="condition" value="//Item/@pno = '123-678'" />
      <sch:param name="mask" value="'WA'" />
      <sch:param name="text"
                 value="Can only ship item 123-678 to Washington State" />
   </sch:pattern>
   <sch:pattern is-a="conditional-exclusion">
      <sch:param name="path" value="Item/comment" />
      <sch:param name="condition" value="$QuickBooks = 'true'" />
   </sch:pattern>
</sch:schema>

This uses ISO Schematron's abstract pattern facility. The declarations for these abstract patterns are left as an exercise to the reader! For some of them, you might have to resort to extension functions (though the XSLT2 binding for Schematron allows you to define your own functions using XSLT elements.)
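
To give a flavour of the exercise, here is a minimal sketch of what one such declaration could look like, for the "string" pattern only. It rests on an assumption: that CAM's setLength with a single number means "exactly this many characters", which the CAM specification may define differently. The id state-length on the instance is hypothetical as well.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <!-- Hypothetical declaration of the "string" abstract pattern.
        Assumes an exact-length reading of CAM's setLength. -->
   <sch:pattern abstract="true" id="string">
      <!-- $path and $length are replaced textually by the instance's params -->
      <sch:rule context="$path">
         <sch:assert test="string-length(.) = $length">
            The value of <sch:name/> does not have the required length.
         </sch:assert>
      </sch:rule>
   </sch:pattern>

   <!-- Instance: behaves as if "shipTo/state" and "2" had been written above -->
   <sch:pattern is-a="string" id="state-length">
      <sch:param name="path" value="shipTo/state"/>
      <sch:param name="length" value="2"/>
   </sch:pattern>
</sch:schema>
```

Because parameter substitution is purely textual, the same declaration serves for any path and any length. The date and number masks are the ones more likely to need extension functions, since plain XPath 1.0 has no mask matching of its own.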

Note that the last constraint in the CAM example, excludeElement(), does not seem to be a constraint at all, but a transformation. But we can declare it and test for it in Schematron.

Note that the use of abstract patterns also allows us to parameterize patterns in a way that non-Schematron implementations could use the information too. Indeed, the Schematron schema could be converted to CAM. And I would expect that makeOptional would need a better story.

Now just because many schema-related languages can be implemented by converting them to Schematron, and just because Schematron using abstract patterns can often be converted back to other more domain-specific languages, I am not trying to say that therefore all domain-specific languages like CAM are somehow made obsolete by Schematron.

One would hope that a domain-specific language would be more convenient and congenial than a general-purpose pattern language like Schematron. But what if your categories don't fit in with those of CAM or XSD and the others? Schematron's abstract patterns can provide at least a good prototyping workbench for developing and declaring schemas according to your own categories. I would like Schematron to be good enough that even people intending to use it merely to prototype validators with their categories will stick with it.

Exemplars in Schematron

Here is something new (I think, but probably Eric has been here before).

I note that it is possible to go even one step further, and make a kind of Examplotron 'WYSIWYG' declaration mechanism using Schematron. Consider this:

   <sch:pattern id="allowed-elements">
      <sch:let name="exemplar"
           value="document('po.cam')//as:structure" />

      <sch:rule context="*">
         <!-- name(current()) keeps the tests usable with an XPath 1.0 query binding -->
         <sch:assert test="$exemplar//*[name() = name(current())]">
            The name of the current element should be in the exemplar.
         </sch:assert>
         <sch:assert test="not(parent::*) or
               $exemplar//*[name() = name(current())]
                  /parent::*[name() = name(current()/parent::*)]">
            This element should have one of the allowed parents from the exemplar.
         </sch:assert>
      </sch:rule>
   </sch:pattern>

These rules could be expanded to be more satisfactory of course.

The XPaths are of course more convoluted, but they are things you would write once, and then change the exemplar. So even this aspect of CAM could be implemented in Schematron, if you wanted.


1 Comment

Rick, nobody tells me anything! I just stumbled across your delightful piece here by happenstance and Google.

I will definitely post links here. Very insightful as ever.

The CAM work has taken on some interesting new directions, particularly now supporting true Content Assembly Blueprints from canonical XML dictionaries of core components. This is where the work really originated; validation was another interesting piece that we came upon along the way.

One of the fundamental things that CAM has shown it does well is combine with xslt, because the CAM syntax is concise and marries well with xslt traversal and manipulation. The whole CAM toolkit is now exploiting this to allow very cool things to be engineered driven off the base WYSIWYG XML in the CAM template. Someone did mention the option of generating Schematron from CAM - and certainly your article gives great hints and tips for someone to follow there.

We are planning to publish OASIS CAM standard upgrade to include the canonical dictionary XML format and the latest features learned from the open source work on the CAMprocessor project.

Thanks, DW
