Recently I have seen some Schematron schemas, written by good XSLT programmers, which represented essentially all assertion tests as custom XSLT2 functions. (Schematron allows this.) The schemas were successful, in that they functioned as desired, but I don't think there was any need to step outside Schematron's own capabilities and use functions. Schematron is not XSLT: it has been designed to get the writer to state their assertions separately from the tests that check them. In the schemas I saw, there was in fact no assertion text, and the functions returned messages for use as diagnostics.
So when is it unnecessary to use XSLT2 user-defined functions?
- For a start, where the test involves just iterating through a document. The basic Schematron pattern, rule and assert capabilities take care of that: in particular, there is no need for a for-each loop when you have XPaths.
- Next, it is unnecessary when there are multiple levels of iteration, such as iterating through the sections in a book, sub-iterating through each paragraph in a section, and sub-iterating through each attribute of the paragraph: again this is just a straightforward XPath: book/section/paragraph/@*
- When the XPath becomes complex, but is still basically nested iterations, you can always use variables to pull loops apart. Here is a weak example: instead of book/paragraph/@* we pull them apart then have a simple test on them:
<sch:rule context="book">
  <sch:let name="all-paras-in-book" value="paragraph"/>
  <sch:let name="all-attributes-on-para" value="$all-paras-in-book/@*"/>
  <sch:assert test="$all-attributes-on-para">Paragraphs should have at least one attribute</sch:assert>
</sch:rule>
The advantage of this approach over loops is that it forces the programmer to give a clear name to each variable. Complex iterations, in the absence of names or other documentation, can be difficult to fathom. (For the background on this approach, and why xsl:variable is called sch:let in Schematron, see Landin's seminal 1966 paper The Next 700 Programming Languages.)
- Similarly, it is unnecessary when you make multiple iterations through different parts of a document (or documents) and then have tests comparing the resulting node sets.
- When it is desired to iterate over and test each of the nodes selected by some complex sequence of let expressions, XPath2 provides the excellent keywords some and every, which can be used inside Schematron assertions (or, indeed, in let). XPath2 also provides control keywords (for and if) that correspond to XSLT2's xsl:for-each and xsl:if.
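To make the last two points concrete, here is a minimal sketch, assuming an XSLT2 query binding. It reuses the book and paragraph names from the example above; the xref element and the target and id attributes are hypothetical, invented only for illustration.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="book">
      <!-- Each let names one "loop" that a function would otherwise hide -->
      <sch:let name="all-paras-in-book" value="paragraph"/>
      <!-- Hypothetical names: xref/@target points at some element's @id -->
      <sch:let name="ids-cited" value="//xref/@target"/>
      <sch:let name="ids-defined" value="//@id"/>

      <!-- "every" tests each selected node in turn, with no for-each loop -->
      <sch:assert test="every $para in $all-paras-in-book satisfies $para/@*">
        Every paragraph should have at least one attribute
      </sch:assert>

      <!-- Comparing two node sets: each cited id must equal some defined id -->
      <sch:assert test="every $target in $ids-cited satisfies $target = $ids-defined">
        Every cross-reference should point at an id defined in the document
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Note that the first assertion here is stronger than the weak example earlier: it fails if any single paragraph lacks attributes, rather than passing as soon as one paragraph in the book has one.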
So when is it necessary to define XSLT2 functions in Schematron? I see three cases. The first is when there is some fancy IO required, such as a fallback system for database access. The second is where a recursive function is needed for data manipulation of some kind: however, I note that because XSLT2's built-in functions and operators are so strong (compared to XSLT1 at least), the cases where recursion is necessary are far rarer now, and, as I mentioned, there is no need for recursive functions merely to duplicate what XPaths can do. The third case is where it is more convenient to express some very convoluted XPath access as a function for information-hiding (rather than abstraction) reasons. I think all three of these cases may be rather rare.
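For the second case, a sketch of what a justified recursive function might look like. It assumes an XSLT2-based implementation that accepts xsl:function as a top-level element of the schema. The part element, its id and parent-ref attributes, the my prefix and its namespace URI are all hypothetical. The function follows a chain of references of unknown length and reports whether it loops back on itself, which a fixed XPath cannot easily express.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:my="http://example.com/ns/my-functions"
            queryBinding="xslt2">
  <!-- Declare the (hypothetical) function namespace for use in tests -->
  <sch:ns prefix="my" uri="http://example.com/ns/my-functions"/>

  <!-- True if following @parent-ref from $part revisits a part in $seen -->
  <xsl:function name="my:has-cycle" as="xs:boolean">
    <xsl:param name="part" as="element(part)"/>
    <xsl:param name="seen" as="element(part)*"/>
    <!-- The part that $part refers to, if there is one -->
    <xsl:variable name="next" as="element(part)?"
                  select="(root($part)//part[@id = $part/@parent-ref])[1]"/>
    <xsl:sequence select="
        if (empty($next)) then false()
        else if (exists($next intersect $seen)) then true()
        else my:has-cycle($next, ($seen, $next))"/>
  </xsl:function>

  <sch:pattern>
    <sch:rule context="part[@parent-ref]">
      <!-- The assertion text stays in the schema, not in the function -->
      <sch:assert test="not(my:has-cycle(., .))">
        A part should not be its own ancestor through parent-ref links
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Even here the function only returns a boolean: the natural-language assertion is still declared in the sch:assert, separate from the test.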
In XSLT1-based Schematron systems, there definitely was a weakness: iterating in parallel over two sets of nodes was not convenient. I believe this is no longer a problem under XSLT2.
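One way parallel iteration might be written under an XSLT2 query binding is to range over positions and index both sequences with the same number. This is only a sketch: the toc, entry, section and title element names are hypothetical.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="book">
      <!-- Two node sequences that should line up item by item -->
      <sch:let name="toc-entries" value="toc/entry"/>
      <sch:let name="section-titles" value="section/title"/>

      <sch:assert test="count($toc-entries) = count($section-titles)">
        The table of contents should have one entry per section
      </sch:assert>

      <!-- $i walks both sequences in step; a missing title makes the test fail -->
      <sch:assert test="every $i in 1 to count($toc-entries)
                        satisfies $toc-entries[$i] = $section-titles[$i]">
        Each table of contents entry should match the title of the
        section in the same position
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```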
I think Schematron developers coming in from XSLT would find the let-based approach interesting and convenient, and a good chance to wean themselves off the for-loop mentality.