Converting XML Schemas to Schematron (#16): XML Schema Test Suite results for beta 0.5

The suite smell of success?

By Rick Jelliffe
August 5, 2009

Paul Hermans has a really good blog, Living in the XML and RDF World. (I would have put it worlds!) I often find it dealing with the same kinds of technical issues that I come across in my day jobs, and my readers with a production XML interest (rather than the office document interest) will find it stimulating: mainly hardcore technical material and very useful. Paul has had a recent interest in pipelines and RDF, for example.

Paul has kindly set up a process (I believe an XProc pipeline using Norm Walsh's Calabash and Michael Kay's SAXON 9) to test the XML Schema to Schematron converter I have been documenting in this blog over the last few years. I made some changes in the initial round of tests to get it to bootstrap without hanging: some tests against circular inclusions, for example. With the addition of the better namespace handling (to get it to a level to support the IPO example from the XSD Primer), Paul has run the tests and I'll be including the results in the ZIP file.

Here is an example of some results:

Green = valid. Blue = invalid. Red = false invalid. Yellow = false valid.

| Test file | Description | Expected | Validator result | OK |
|---|---|---|---|---|
| msData/errata10/errA001.xml | TEST :Primer Errata : E0-23 Clarification: test that facet fractionDigits can be added to all numeric datatypes as long as value is 0 (except for decimal which takes any value) | valid | valid | true |
| msData/errata10/errF001.xml | TEST :Primer Errata : Errata E2-35: length facet is now allowed with either minLength or maxLength if they are specified in different derivation steps | invalid | invalid | true |
| msData/errata10/errA002.xml | TEST :Primer Errata : E0-10 Error, E1-11 Error: test that ##other namespace is any namespace other than the target namespace | invalid | valid | false |
| msData/errata10/errA003.xml | TEST :Primer Errata : E0-15 Error, E2-12 Error: test lexical representation of gMonth | valid | invalid | false |

In the first case, the test was expected to produce a result of valid (the first green), the validator returned a result of valid (the second green), and this is an OK result (the third green). In the second example, the test was expected to fail (the first blue), the validator returned a result of fail (blue), and this is an OK result (the third blue). A consistent band is what we want.

In the third case, the test was expected to fail, but the validator did not pick up the error and returned a positive. So the third column is yellow. Yellow is at least regrettable.

In the fourth case, the test was expected to pass, but the validator generated an invalid result. So the third column is red. Red is probably bad.
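The colour scheme described above can be written down as a small function. This is only an illustrative sketch in Python, not part of Paul's pipeline: the function name `classify` and the tuple layout are assumptions, and the four rows are the ones from the excerpt above.

```python
def classify(expected: str, actual: str) -> str:
    """Return the colour of the final (OK) column for one test row."""
    allowed = ("valid", "invalid")
    if expected not in allowed or actual not in allowed:
        raise ValueError("results must be 'valid' or 'invalid'")
    if expected == actual:
        # Consistent band: green for valid, blue for invalid.
        return "green" if expected == "valid" else "blue"
    # Mismatch: red = false invalid, yellow = false valid.
    return "red" if actual == "invalid" else "yellow"


# (test file, expected result, validator result), copied from the excerpt.
rows = [
    ("msData/errata10/errA001.xml", "valid", "valid"),
    ("msData/errata10/errF001.xml", "invalid", "invalid"),
    ("msData/errata10/errA002.xml", "invalid", "valid"),
    ("msData/errata10/errA003.xml", "valid", "invalid"),
]

colours = [classify(expected, actual) for _, expected, actual in rows]
for (name, _, _), colour in zip(rows, colours):
    print(name, colour)
```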

For the purposes of this converter, I weigh false negatives (valid documents reported as invalid, the reds) as more important than false positives (invalid documents reported as valid, the yellows): we want to catch problems if we can find them, not reject documents on spurious grounds.

The test results are quite interesting, but a numerical or even visual/squinting interpretation of the results would be meaningless in our case for four reasons:

  • The tests explore edge cases. There are over 120 tests on wildcard particles alone, for example, and scores of tests for regular expressions.
  • The tests mix up tests of schemas and instances randomly. The XSD to Schematron converter is designed assuming that the schemas are valid and correct, so if the test results relating to schema-correctness are correct it is a nice bonus but nothing more.
  • There are many features of XML schemas we deliberately do not implement yet, such as xsi:type, xsi:nil, key/keyref, wildcards, final, union and list, and these are well represented in the tests.
  • Many of the test results are expected to be valid. Our default operation of not testing anything we don't understand will therefore tend to produce a result of 'valid'. So a result of three greens does not necessarily indicate successful implementation of the feature purportedly under test.

So with this big caveat that the test results are not a score card, except perhaps on a really coarse level, here are a couple of results as an indicator of progress.

The test collection relating to model groups is one I was hoping we had implemented:


The reds seem to be related to the use of the wildcard any, which is not supported for the moment, so the results are probably pretty good. (But that there are so many expected green results (valid) does not necessarily mean that the tests succeeded: they may not be tested at all. An even mix of positive and negative tests would be much more sound. We could have got almost as good a result merely by returning 'valid' every time!)
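To make the "returning 'valid' every time" point concrete, here is a minimal Python sketch of that baseline. The 9-to-1 mix of expected results is made up purely for illustration; it is not taken from the test suite, and the function names are hypothetical.

```python
def agreement(expected, actual):
    """Fraction of tests where the reported result equals the expected one."""
    if not expected or len(expected) != len(actual):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(e == a for e, a in zip(expected, actual))
    return matches / len(expected)


def always_valid_baseline(expected):
    """Agreement achieved by a 'validator' that checks nothing at all."""
    return agreement(expected, ["valid"] * len(expected))


# Hypothetical, made-up collection: nine tests expected valid, one invalid.
expected = ["valid"] * 9 + ["invalid"]

baseline = always_valid_baseline(expected)
print(f"always-valid baseline agreement: {baseline:.0%}")
```

Under that assumed mix the do-nothing baseline already agrees with most of the expected results, which is why a band of greens on its own says little about whether a feature is really implemented.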

For the related group tests:


This is a much more mixed bag, but I suspect it is the same issue in most cases: maybe there is a bug in detecting that a required element is missing.

In retrospect, what would be great is some indication from the validator that an unsupported feature has been used in the schema. But there are so many of them that this would be a full-time job for a few months in its own right, and putting in the logic to detect the cases would be most of the work of implementing them. That unsettles me, so I think I will have to devise some way of testing that a schema does not use features whose support is required to prevent false positives. (For example, a feature like key/keyref can be severed without compromising structure and data tests, while a feature like substitution groups cannot: key/keyref is one of the few separable layers in XSD. The minimal set of elements and cases to support in XSD is insanely large, and this lack of layering directly complicates implementation.) Probably a good job for a Schematron schema!
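As a rough idea of what such a check could look like, here is a sketch in Python rather than Schematron, using only the standard library. The feature list is an assumption drawn from the unimplemented features named earlier in this post (key/keyref, wildcards, final, union, list); it is far from complete, and the function and constant names are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Assumed, incomplete lists based on the features named in this post.
UNSUPPORTED_ELEMENTS = {"key", "keyref", "any", "anyAttribute", "union", "list"}
UNSUPPORTED_ATTRIBUTES = {"final"}


def unsupported_features(xsd_text: str) -> list[str]:
    """Return a sorted list of unsupported constructs found in a schema."""
    root = ET.fromstring(xsd_text)
    found = set()
    for el in root.iter():
        # ElementTree spells qualified names as "{namespace}local".
        if not el.tag.startswith("{" + XS + "}"):
            continue
        local = el.tag.split("}", 1)[1]
        if local in UNSUPPORTED_ELEMENTS:
            found.add("xs:" + local)
        for attr in UNSUPPORTED_ATTRIBUTES:
            if attr in el.attrib:
                found.add("@" + attr)
    return sorted(found)


# A small made-up schema that uses a wildcard, final and a key.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="OrderType" final="extension">
    <xs:sequence>
      <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
      <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="order" type="OrderType">
    <xs:key name="itemKey">
      <xs:selector xpath="item"/>
      <xs:field xpath="."/>
    </xs:key>
  </xs:element>
</xs:schema>"""

report = unsupported_features(SAMPLE)
print(report)
```

A scan like this only flags that a construct appears somewhere in the schema. It does not decide whether the construct can be safely ignored, which is the harder question raised above.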

Testing the built-in atomic simple types is quite straightforward with XPath 2.0: the castable as expression can take care of them. So how are we going for attributes?


These are pretty horrible results, but looking in more detail there are a lot of tests for the things we don't implement yet: union, use="required", attributeFormDefault. It looks like there is a problem with attributeGroup that I was not expecting. So disappointing but informative.

Finally, the general tests for datatypes. These are ones that again I expect should mainly be implemented, and it looks like there is a problem with handling the range facet overrides for numbers. Should be easy to fix, but it is a really basic thing I am surprised was not caught before. The other datatype test results are pretty good, actually.


These results apply to version 0.5 beta of the converter. It should be available from the website in a few days (August 2009). Thanks again to Paul for his work on this, and to Norm for Calabash. Having the test suite up is a really great help for the next stage of development: moving from supporting a specific couple of schemas to supporting a broader range.
