Weak validation using hash codes

By Rick Jelliffe
September 7, 2009

High-performance gateways are a potential use case for efficient weak validation systems. A few weeks ago, I blogged about validation using tries and feature sets.

Another, weaker form of validation would be to validate using hash codes. Given some efficient way of generating a hash code from an element or attribute name, generate various minimum values for every content model in a grammar: the minimum hash code of the first position, the minimum hash code of the last position, and the minimum hash code of any valid path through the content model (for example, the smallest total of the hash codes along a path).
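As a rough sketch of how those minimums might be derived, the Python below walks a content model and collects them. Everything named here is an assumption made for illustration: the tuple encoding of content models, the `name_hash` function (CRC-32 cut down to 16 bits), the `analyse` and `minima` helpers, and the reading of "minimum hash code of a path" as the smallest sum of hash codes along a path.

```python
import zlib

def name_hash(name):
    # Assumption: CRC-32 reduced to 16 bits. Any cheap, deterministic hash would do.
    return zlib.crc32(name.encode("utf-8")) & 0xFFFF

def analyse(model):
    """Return (nullable, first_hashes, last_hashes, min_sum, min_count).

    Hypothetical encoding: ("elem", name), ("opt", m), ("star", m),
    ("plus", m), ("seq", [m, ...]), ("choice", [m, ...]).
    """
    kind = model[0]
    if kind == "elem":
        h = name_hash(model[1])
        return (False, {h}, {h}, h, 1)
    if kind in ("opt", "star"):
        # May be absent, so it contributes nothing to the minimum sum or count.
        _, first, last, _, _ = analyse(model[1])
        return (True, first, last, 0, 0)
    if kind == "plus":
        # One occurrence is the cheapest valid case.
        return analyse(model[1])
    parts = [analyse(m) for m in model[1]]
    if kind == "choice":
        return (any(p[0] for p in parts),
                set().union(*(p[1] for p in parts)),
                set().union(*(p[2] for p in parts)),
                min(p[3] for p in parts),
                min(p[4] for p in parts))
    if kind == "seq":
        first, last = set(), set()
        for p in parts:                 # scan forward past optional leading parts
            first |= p[1]
            if not p[0]:
                break
        for p in reversed(parts):       # scan backward past optional trailing parts
            last |= p[2]
            if not p[0]:
                break
        return (all(p[0] for p in parts), first, last,
                sum(p[3] for p in parts), sum(p[4] for p in parts))
    raise ValueError("unknown content model node: %r" % (kind,))

def minima(model):
    _, first, last, min_sum, min_count = analyse(model)
    return {"min_first": min(first, default=0),
            "min_last": min(last, default=0),
            "min_sum": min_sum,
            "min_count": min_count}

# Example: a content model equivalent to (head, p*, foot)
doc_model = ("seq", [("elem", "head"), ("star", ("elem", "p")), ("elem", "foot")])
print(minima(doc_model))
```

The minimums are lower bounds only, so this table would be computed once per grammar and consulted cheaply at run time.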

Validation then becomes merely a matter of generating a hash code for each element start-tag event and adding it to the parent's record, such as a running total, and then, at each end-tag event, checking the recorded values against the expected minimums. It would obviously be more successful on content models with mostly required elements, since optional elements contribute nothing to the minimums. Some other properties derivable from the content model would also fit, such as the minimum number of children valid in the content model.
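The event loop this describes could look something like the following Python sketch. The names are hypothetical (`name_hash`, `Minima`, `Record`, `weak_validate`), the event stream is a plain list of `("start", name)` and `("end", name)` pairs, and the per-element minimums are written out by hand for a toy grammar in which `doc` contains `head`, any number of `p`, then `foot`. Rejecting elements that are missing from the table is a choice made for this sketch, not something the technique requires.

```python
import zlib
from dataclasses import dataclass
from typing import Optional

def name_hash(name):
    # Assumption: CRC-32 reduced to 16 bits. Any cheap, deterministic hash would do.
    return zlib.crc32(name.encode("utf-8")) & 0xFFFF

@dataclass
class Minima:
    min_sum: int = 0      # smallest total of child hash codes on a valid path
    min_first: int = 0    # smallest hash code allowed in first position
    min_last: int = 0     # smallest hash code allowed in last position
    min_count: int = 0    # smallest number of children

@dataclass
class Record:
    name: str
    total: int = 0
    count: int = 0
    first: Optional[int] = None
    last: Optional[int] = None

def weak_validate(events, grammar):
    stack = []
    for kind, name in events:
        if kind == "start":
            h = name_hash(name)
            if stack:                       # update the parent's record
                parent = stack[-1]
                parent.total += h
                parent.count += 1
                if parent.first is None:
                    parent.first = h
                parent.last = h
            stack.append(Record(name))
        else:                               # end tag: compare with the minimums
            rec = stack.pop()
            m = grammar.get(rec.name)
            if m is None:
                return False                # sketch-only choice: unknown element
            if rec.total < m.min_sum or rec.count < m.min_count:
                return False
            if rec.count and (rec.first < m.min_first or rec.last < m.min_last):
                return False
    return True

# Toy grammar: doc -> (head, p*, foot); the other elements are empty.
GRAMMAR = {
    "doc": Minima(min_sum=name_hash("head") + name_hash("foot"),
                  min_first=name_hash("head"),
                  min_last=name_hash("foot"),
                  min_count=2),
    "head": Minima(), "p": Minima(), "foot": Minima(),
}

good = [("start", "doc"), ("start", "head"), ("end", "head"),
        ("start", "p"), ("end", "p"),
        ("start", "foot"), ("end", "foot"), ("end", "doc")]
print(weak_validate(good, GRAMMAR))
```

Because only lower bounds are checked, some invalid documents will still pass; the work per event is a hash, a few additions and a few comparisons.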

The same technique can be used for attributes and data values. It could serve as a coarse sieve ahead of finer-grained techniques.

