Ruby Schematron

By Rick Jelliffe
June 3, 2010

Francesco Lazzarino has a project up at RubyForge: a Ruby runner for ISO Schematron (open source, MIT/Consortium License). Schematron is a small ISO-standard language for making assertions or reports about patterns in and between XML documents, typically using XPath.
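
To make "assertions about patterns" a little more concrete, here is a rough sketch of the idea in plain Ruby. It is not Schematron and not part of Francesco's project: it uses REXML rather than libxml, and the sample document and the failed_asserts helper are made up for illustration. It also simplifies the test to "does this XPath select anything", where a real Schematron test is any boolean XPath expression.

```ruby
require 'rexml/document'

# Hypothetical instance document, invented for this illustration.
doc = REXML::Document.new(<<~XML)
  <order>
    <item price="5"/>
    <item/>
  </order>
XML

# Assert-style check (hypothetical helper): for every node selected by
# context_xpath, the test XPath must select something. Nodes for which it
# selects nothing are returned as failures.
def failed_asserts(doc, context_xpath, test_xpath)
  REXML::XPath.match(doc, context_xpath).reject do |node|
    REXML::XPath.first(node, test_xpath)
  end
end

# "Every item must have a price attribute."
failures = failed_asserts(doc, '//item', '@price')
failures.each { |node| puts "assert failed at #{node.xpath}" }
```

A Schematron report is the mirror image: it fires when the test is true rather than when it is false.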

Here are the guts of the runner. I don't know enough Ruby (well, any) to know whether it is good code, but if it works with line numbers, that looks pretty good. I'd love to see two things added: something to select phases (most importantly) and something to provide parameters (or is that there already?). Any Ruby-chans with a spare hour might like to contribute it:

require 'libxml'
require 'libxslt'

module Schematron

  include LibXML
  include LibXSLT

  # The directory where the ISO Schematron implementation lives
  ISO_IMPL_DIR = File.join File.dirname(__FILE__), "..", 'iso_impl'

  # The file names of the compilation stages
  ISO_FILES = [ 'iso_dsdl_include.xsl',
                'iso_abstract_expand.xsl',
                'iso_svrl.xsl' ]

  # Namespace prefix declarations for use in XPaths
  NS_PREFIXES = {
    'svrl' => 'http://purl.oclc.org/dsdl/svrl'
  }

  class Schema

    def initialize(doc)
      schema_doc = doc

      # Load each compilation stage as a stylesheet
      xforms = ISO_FILES.map do |file|
        Dir.chdir(ISO_IMPL_DIR) do
          doc = XML::Document.file file
          LibXSLT::XSLT::Stylesheet.new doc
        end
      end

      # Compile schematron into xsl that maps to svrl
      validator_doc = xforms.inject(schema_doc) { |xml, xsl| xsl.apply xml }
      @validator_xsl = LibXSLT::XSLT::Stylesheet.new validator_doc
    end

    def validate(instance_doc)
      # Validate the xml
      results_doc = @validator_xsl.apply instance_doc

      # compile the errors and log any messages
      rule_hits(results_doc, instance_doc, 'assert', '//svrl:failed-assert') +
        rule_hits(results_doc, instance_doc, 'report', '//svrl:successful-report')
    end

    # Look for reported or failed rules of a particular type in the instance doc
    def rule_hits(results_doc, instance_doc, rule_type, xpath)
      results = []

      results_doc.root.find(xpath, NS_PREFIXES).each do |hit|
        context = instance_doc.root.find_first hit['location']

        hit.find('svrl:text/text()', NS_PREFIXES).each do |message|
          results << {
            :rule_type => rule_type,
            :type => context.node_type_name,
            :name => context.name,
            :line => context.line_num,
            :message => message.content.strip }
        end
      end

      results
    end

  end

end
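
As far as I can read it, validate returns a plain array of hashes, one per failed assert or successful report, each carrying the line number of the context node. Here is a small sketch of how a caller might turn those into a report. The hits below are made up (in real use they would come from validate, which needs the libxml and libxslt gems plus the ISO stylesheets), and format_hit is a hypothetical helper, not part of the project.

```ruby
# Hypothetical hits, shaped like the hashes that rule_hits builds:
# :rule_type, :type, :name, :line and :message.
hits = [
  { :rule_type => 'assert', :type => 'element', :name => 'item',
    :line => 12, :message => 'An item must have a price.' },
  { :rule_type => 'report', :type => 'element', :name => 'note',
    :line => 7, :message => 'A note was found.' }
]

# Hypothetical helper: render one hit as a single line of text.
def format_hit(hit)
  "line #{hit[:line]}: [#{hit[:rule_type]}] " \
    "#{hit[:type]} '#{hit[:name]}': #{hit[:message]}"
end

# Sort by line number so the report follows the order of the document.
report = hits.sort_by { |hit| hit[:line] }.map { |hit| format_hit(hit) }
puts report
```

Keeping the result as plain hashes means a caller can sort, filter by :rule_type, or reformat without touching the XML again.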

