I wasn't there, but the XML Prague presentations are online now. Here are my thoughts from rummaging through some of them. There was a strong emphasis on XSLT and XPath-based systems: I think this reflects a technical opportunity that has been difficult for the big boys to take advantage of, since it does not fit into their product lines or marketing stories well.
First, this is the first conference where every presentation is online as video. Unfortunately, Australia is 38th in the world for average download times, and my company has a strict policy on such things and is about 1/3 the average European speed in any case, so I have not dared view any. But it seems a great idea if you are on the right side of the video divide. So I'll limit myself for now to the presentations with text collateral.
Michael Kay has a presentation XML Schema moves forward. Michael has implemented large chunks of XML Schema in his SAXON XSLT2 processor, and has excellent access to the XML Schema WG as an editor of XSLT2 and XPath2 and a member (Invited Expert) of the W3C XML Schema WG. XML Schema 1.1 is currently a 'Working Draft in Last Call' at W3C.
I was particularly taken with this from Michael's last slide:
The XML Schema WG needs help
-reviews and comments
My long-suffering readers will know why I am taken with this: active engagement with and recruitment from the larger community is a sign of a healthy standards group.
Michael has some interesting comments on the proposed assertions mechanism for XSD 1.1:
•Restriction by grammar
-requires repeating the content model
•Restriction by assertion
-just say what's different
-can do "deep restriction"
test="empty( .//@currency[. ne 'USD'] )"
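To make the slide's test concrete, here is a rough sketch of what that assertion checks, written in plain Python with the standard library rather than in XSD. The sample invoice markup and the function name only_usd are my own assumptions for illustration; they are not from Michael's slides. The XPath passes when no currency attribute on the context element or any of its descendants has a value other than USD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, invented for this illustration.
MIXED = """<invoice>
  <line><price currency="USD">10.00</price></line>
  <line><price currency="EUR">8.50</price></line>
</invoice>"""

USD_ONLY = """<invoice>
  <line><price currency="USD">10.00</price></line>
  <line><price>3.00</price></line>
</invoice>"""


def only_usd(element):
    """Rough equivalent of empty(.//@currency[. ne 'USD']).

    .//@currency covers the context element itself and all of its
    descendants, which is what element.iter() walks.
    """
    offending = [
        e for e in element.iter()
        if e.get("currency") not in (None, "USD")
    ]
    return not offending


print(only_usd(ET.fromstring(MIXED)))     # False
print(only_usd(ET.fromstring(USD_ONLY)))  # True
```

The point of "deep restriction" is visible here: the check reaches any depth without restating the content model in between.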
Maintenance nightmare? That is quite strong (and obviously not always the case) and Michael has quite a few brutally honest comments about XSD 1.1. I don't think the XML Schema WG has much alternative, now there is so much information around on the mismatches between what XML Schemas is good at (and there is no need to pretend that XML Schemas is bad at everything) and what people need.
I quite like the Conditional Type Assignment (CTA) mechanism, in theory: if you have to have types then the more flexibly they can be selected the better. I would have preferred an alignment with RELAX NG however, to allow attributes in content models. That is neater; but I think they may solve slightly different problems: if CTA were also applicable to, and usable within, xs:group and xs:attributeGroup it would get closer to RELAX NG's power. XSD's distinction between complex types and groups is terribly incoherent, it seems to me.
But at least the CTA concept shows a glimmer that the XML Schema WG is realizing that their earlier idea, which was that attributes are funny kinds of elements, is wrong for important classes of documents, where it is more like element names are a funny kind of attribute value. This conception is one of the fundamental differences between databases and documents as idioms, I would say.
Michael makes the interesting observation that the addition of assertions makes XSD compete with Schematron, but I would perhaps say that people who can switch over to XSD 1.1 assertions from Schematron probably were not using Schematron very well! I am not being dismissive here: in fact the ISO Schematron standard even has an annex on how to take subsets of Schematron and borrow them in other languages. But XSD 1.1 assertions do not allow assertion texts (Michael calls them "error messages" which utterly gets it wrong from the Schematron perspective), nor external documents, nor phases (progressive validation), nor dynamic diagnostics (which are "error messages"), and they have scoping restrictions. But way better than nothing.
I guess my problem with XSD 1.1 is that it solves the problems that were pressing (from my POV) in 1999, while in 2009 we have a whole different set of schema-related problems: external code lists, data broken between files, the rise of inline notations that are not usefully validated by simple regexes (yes, I said notations!), and many challenges with schema evolution and schema variants (evolution is not linear). Schematron is still, I think, the only standard schema language with any reasonable story on these (except notations), though the various improvements to RELAX NG/NVDL on the cards will certainly move things forward. So while I am delighted that XSD looks like getting assertions 10 years too late, it is 10 years too late.
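As an illustration of the external code list problem, here is a minimal Python sketch that validates attribute values against a list kept in a separate document. The codes/code list format, the sample invoice, and the function name unknown_codes are all hypothetical, made up for this example; real code lists would normally use an agreed format and live in their own files.

```python
import xml.etree.ElementTree as ET

# Hypothetical code list, standing in for a separately maintained file.
CODE_LIST = """<codes>
  <code>USD</code>
  <code>EUR</code>
  <code>AUD</code>
</codes>"""

# Hypothetical instance document that refers to codes by value.
DOCUMENT = """<invoice>
  <price currency="USD">10.00</price>
  <price currency="XXX">1.00</price>
</invoice>"""


def unknown_codes(doc_xml, codes_xml, attribute="currency"):
    """Return the attribute values in the document that are not in the code list."""
    allowed = {c.text.strip() for c in ET.fromstring(codes_xml).iter("code")}
    return [
        e.get(attribute)
        for e in ET.fromstring(doc_xml).iter()
        if e.get(attribute) is not None and e.get(attribute) not in allowed
    ]


print(unknown_codes(DOCUMENT, CODE_LIST))  # ['XXX']
```

The list can change without touching the schema or the checking code, which is the property that an enumeration baked into a grammar does not give you.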
It looks like a stimulating talk I would have enjoyed. I use SAXON XSLT on almost every project, and most programmers I know use it by default. If you are using Java, it is certainly worth looking at seriously. (It has a .NET version too, which should be just as good.)
Tony Graham has a good general talk on Testing XSLT. He quite likes unit tests in moderation, but is not keen on metrics in general, I think. The end is the most interesting, where he emphasizes that human eyes are always needed for testing. His numbers are based on Microsoft's Steve McConnell's book, and I am entirely suspicious of the generality of the numbers: different companies and processes and programming culture and characteristic flaws are so specific that sectoral data may not be very reliable. NASA's study, which showed that metrics which accurately reflected their past bugs turned out to be poor at finding or predicting new bugs, suggests this.
The talk has a good list of current test tools.
Jeni Tennison follows this up with a specific look at XSpec, a unit testing system for XSLT that looks reasonable. A scenario (equivalent to a pattern in Schematron) has a context in the input document (equivalent to a rule context in Schematron) and then an expect for patterns in the output document (equivalent to an assertion test to an external document in Schematron). XSpec has some shorthand and grouping mechanisms (context/@mode). It would be much terser than Schematron for this kind of use, because rather than expressing the context and test as XPaths, it expresses them as exemplars. Schematron has never taken off for functional testing, even though it is perfectly capable of it, and I think XSpec shows that, at least for simple constraints suitable for exemplars, the XSpec kind of arrangement is less work.
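The exemplar idea can be sketched outside XSpec. The following Python is not XSpec and does not use its syntax: para_to_p is a hypothetical stand-in for the template under test, and run_scenario and same_tree are names I made up. It only shows the shape of the arrangement, where the test supplies a sample input fragment and a sample expected output fragment instead of XPaths.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Compare two elements by tag, attributes, trimmed text and children.

    Tail text is ignored to keep the sketch short.
    """
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "").strip() == (b.text or "").strip()
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


def para_to_p(para):
    """Hypothetical stand-in for the transformation under test: para becomes p."""
    p = ET.Element("p")
    p.text = para.text
    return p


def run_scenario(transform, context_xml, expect_xml):
    """Apply the transform to the context exemplar and compare with the expected exemplar."""
    actual = transform(ET.fromstring(context_xml))
    return same_tree(actual, ET.fromstring(expect_xml))


print(run_scenario(para_to_p, "<para>Hello</para>", "<p>Hello</p>"))  # True
```

Writing the same check with XPaths would mean spelling out a context path and one test per property of the output, which is where the extra verbosity comes from.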
Ken Holman gave a talk Introduction to Code-Lists in XML. I think this should be required reading for any professional involved in schema creation.
(But you know those tiny bibles engraved on a piece of rice?... Ken must have had a sore throat that day.)