Can Schematron validate RAND()?

By Rick Jelliffe
February 28, 2009

Interesting point raised on the ODF TC (related to my MODUS blog of a couple of days ago): can Schematron validate the ODF OpenFormula RAND() function?

Now, at first take this might seem a ridiculous question: Schematron validates documents not applications. However, the context is wider: can current or devised general schema languages be useful for going beyond conventional document testing, with the aim of allowing objective automatic verification of every normative constraint in a standard that includes application semantics? It is a big ask.

So let's look at the smaller, specific issue.

Yes, Schematron can indeed validate RAND(). Here is how:

  1. Generate an instance spreadsheet document with, say, 100,000 cells, each of which holds a formula containing just RAND(), which we will say is a function generating numbers between 0 and 1 with an equal chance of any value.
  2. Save this as XML (ODF, etc.)
  3. Make up a schema like the following:
<let name="accuracy" value="0.0001" />

<rule context="table:table">
<let name="average" value=
"(sum(table:table-row/table:table-cell/o:p)
div count(table:table-row/table:table-cell/o:p))" />
<assert test="$average + $accuracy > 0.5">
A random value generator will, as the number of
generated values becomes large, have an average
value of 0.5.
</assert>
<assert test="$average - $accuracy &lt; 0.5">
A random value generator will, as the number of
generated values becomes large, have an average
value of 0.5.
</assert>
</rule>
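
For readers who want to try the idea without a Schematron processor, here is a rough Python sketch of the same check. It is only an illustration under stated assumptions: the element names are a simplified, namespace-free stand-in for the saved spreadsheet (real ODF uses namespaced elements), and the helper names build_table, average_of_cells and check_average are hypothetical, not part of any library.

```python
import xml.etree.ElementTree as ET


def build_table(values):
    """Build a stand-in <table> with one row, one cell and one <p> per value.

    Assumption: a simplified, namespace-free shape, not real ODF markup.
    """
    table = ET.Element("table")
    for value in values:
        row = ET.SubElement(table, "table-row")
        cell = ET.SubElement(row, "table-cell")
        ET.SubElement(cell, "p").text = repr(value)
    return table


def average_of_cells(table):
    """Mirror the schema's sum(...) div count(...) over table-row/table-cell/p."""
    cells = table.findall("table-row/table-cell/p")
    return sum(float(cell.text) for cell in cells) / len(cells)


def check_average(table, accuracy):
    """Return the results of the two assertions as a (lower, upper) pair."""
    average = average_of_cells(table)
    return (average + accuracy > 0.5, average - accuracy < 0.5)
```

To mimic steps 1 and 2, pass something like `[random.random() for _ in range(100000)]` (after `import random`) to build_table; a `(True, True)` result from check_average corresponds to both assertions passing.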

Note that in this case we are not testing the statistical property of randomness (a genuinely random generator may, with small probability, happen to produce a set of samples that are all low, for example), but the kind of randomness that is desirable for computations: evenness.
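
How often a sound generator would miss the target by chance can be estimated with a little arithmetic. The sketch below assumes the cells are independent draws from a uniform distribution on 0 to 1 (variance 1/12) and uses the normal approximation for the sample mean; the function names are hypothetical.

```python
import math


def standard_error_of_mean(n):
    """Standard deviation of the mean of n independent uniform(0, 1) draws."""
    return math.sqrt(1.0 / 12.0 / n)


def chance_within(tolerance, n):
    """Approximate probability that the sample mean is within tolerance of 0.5.

    Assumption: normal approximation, reasonable for large n.
    """
    z = tolerance / standard_error_of_mean(n)
    return math.erf(z / math.sqrt(2.0))
```

Under these assumptions the standard error for 100,000 cells is about 0.0009, so a tolerance of 0.0001 would be met by a perfectly good generator in fewer than one run in ten. The figures in the schema are therefore best read as illustrative: in practice you would choose a tolerance of several standard errors, or many more cells.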

Of course, you could then have other tests for other properties of pseudo-randomness. Some (such as cyclical patterns) would not be things you would want to test with Schematron, though certainly I suspect you could do a Fourier analysis with Schematron if you wanted to.

(However, this is certainly going into the area of testing, which standards bodies avoid: the testing organizations do that, but in the case of the consortia, they don't have a parallel set of organizations like NIST set up to do testing, so they may do well to have test suites.)

One final point about Schematron: the idea is that the natural language assertion states the constraint, while the test implements it to some level of granularity. So you may have more than one assertion with the same assertion text, each making a different test. A standard may make the assertion text normative, but note that the assertion test is operational (i.e. adequate for test purposes: necessary but not sufficient).

