Validating Markov Chains in Schematron

By Rick Jelliffe
July 20, 2010

A Markov chain, according to the great Claude Shannon, can be used to model usage patterns in grammars.

For example, say you have a content model for logging a series of chess moves, it might be


colour = attribute colour { 'black' | 'white'}
from = attribute from { text }
to = attribute to {text }

game = element game {
element pawn { colour, from, to }
element rook { colour, from, to }|
element bishop { colour, from, to }|
element knight { colour, from, to }|
element queen { colour, from, to }|
element king { colour, from, to }}

for example


<game>
<pawn colour='white' from="g2" to="g3"/>
<pawn colour='black' from="g7" to="g6"/>
<bishop colour="white" from="g1" to="f3" />
...
</game>

As I understand it, a Markov model lets you assign a kind of grammar over this information, but one where the transitions have a probability of some element given a certain prior sequence (chain) of elements, not just the current state: for example, if you took all the games of some chess player and analyzed them, you would find characteristic patterns: these would reflect the players skill and the characteristics of winning or losing strategies.

Grammar-based schema language (and Schematron in typical use) don't work with probabilities: or at least they work with a simple model of probability: not-allowed, optional, must-occur, zero-or-more, and then they allow various grouping mechanisms. Schematron does allow a wrinkle, in that you can also generate, for example, reports to call out rare sequences, for example


<sch:rule context="king">
<sch:report test="following-sibling::*[1]
[self::king]
[count(previous-sibling::*) &lt; 20 ]
[contains(@to, '2') or contains(@to, '7')]">
Both players are moving their kings into the next rank before the 20th move.
It is unusual to have this kind of movement before the end-game
</sch:report>
</sch:rule>

However, what Schematron can do is capture a Markov model as a schema, then report whether some document follows that model within a degree of tolerance.

For example, say that the white player has never been known to lose within 20 moves:


<sch:rule context="game">
<sch:report test="*[last()]/@colour='white' ">
White won the game.
</sch:report>
<sch:report test="*[last()]/@colour='black' ">
Black won the game
</sch:report>
<sch:report test="*[last()]/@colour='black' and count(*) < 20">
White lost the game in less than 20 moves
<sch:report>
<sch:rule>

Say we find a chain which says a sequence of knight, king, knight, king, queen, queen will be followed by a king at 40%, a queen at 40%, a bishop at 10%, a knight at 5%, a rook at 3% and a pawn at 2%. chance. Then we decide that, for any given game, we are interested in finding exceptions where a pattern occurs more than an expected rate.

We can validate it like this:


&sch:rule context="game">

<sch:let name="king-count" value="
count(knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::king)" />

<sch:let name="queen-count" value="
count(knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::queen)" />

<sch:let name="bishop-count" value="
count(knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::bishop)" />

<sch:let name="knight-chase" value="
count(knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::knight)" />

<sch:let name="rook-count" value="
count(knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::rook)" />

<sch:let name="pawn-count" value="
count(knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::knight/
following-sibling::*[1]/self::king/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::queen/
following-sibling::*[1]/self::king)" />

&sch:let name="total" value="
$king-count + $queen-count + $bishop-count
+ $knight-count + $rook-count + $pawn-count" />

<sch:report id="pawn-chase" test=" $total > $total-threshold
and $pawn-count * $pawn-multiplier < $total">
The number of knight,king,knight,king,queen,queen,pawn moves
is larger than is typical.<sch:report>

</sch:rule>

In this case, we need to figure out and set what our total-threshold and what the pawn-multiplier is, based on the probabilities in our (hidden) Markov model.

(This is really a hidden Markov model, because we don't model the internal grammar per se, just the probabilities. And you would need to be very careful in what you actually are reporting on: for example, in the rule above it may be quite unusual to have a pawn chase, but it may be that every game that has a pawn chase has multiple of them: a game with a five pawn chases does not have five unusual sequences, but one unusual sequence followed by some common ones: this is a Markov chain outside the Markov chain...)

Where the probability of some transition in a (non-hidden) Markov model is explicitly zero, then we are in more familiar territory for validation: the Schematron assert element can be used rather than report, to emphasize that that chain should never occur. (However I will leave the issue of how to validate against transitions which never occur for now: it is the more interesting question: what is needed is to validate that every node participates in at least one allowed chain in some correct position and also substitutes in for another element in a chain with the same left and right contexts.)

Anyway, the point is that Schematron can be used to report on rare sequences based on thresholds and probabilities from a hidden Markov model, as well as validating illegal sequences.

Caveat: I haven't run or tested this code, the intent is to demonstrate the idea. I think it is OK, but corrections are welcome, if you find an error that bugs you.


You might also be interested in:

News Topics

Recommended for You

Got a Question?