Managing XSLT projects with XPath

By Rick Jelliffe
April 1, 2009 | Comments: 3

One of the biggest changes in the way we (wearing my Allette Systems hat) do things at my office over the last five years has been a thorough but largely unplanned adoption of XPaths as a key tool for managing XSLT projects.

It started with our (wearing my Topologi hat) Topologi utilities, one of which generated various metrics, including a complete list of all unique XPaths in the document. It turns out that this is surprisingly useful information for effective project management of XSLT developments.
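As a rough illustration of what such a unique-XPath list looks like, here is a minimal Python sketch. It is not the Topologi utility itself; the function name `unique_xpaths` and the sample document are assumptions made for this example, and namespaces and attributes are ignored.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def unique_xpaths(xml_text):
    """Count each distinct element path in a document, without index predicates."""
    counts = Counter()

    def walk(element, parent_path):
        # Build the path from the root down to this element, e.g. /book/section/p
        path = f"{parent_path}/{element.tag}"
        counts[path] += 1
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


# Hypothetical sample input, only for demonstration.
sample = "<book><section><p>One <emph>two</emph></p><p>Three</p></section></book>"
for path, count in sorted(unique_xpaths(sample).items()):
    print(path, count)
```

Each printed line is one unique path and the number of elements that occur at it, which is the raw material for everything described below.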

For example, when costing and estimating variations in development projects, we can use the change in the number of XPaths. I did make a more sophisticated version of this with my XML Structured Document Metrics; however, the raw unique XPath lists seem to have taken off more, because the lists can be used for more things. (The Structured Document Metrics are still useful at more top-level checkpoints, it seems.)
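One way to picture "the change in the number of XPaths" is to compare the path lists from two rounds of sample documents. The sketch below is only an assumption about how that comparison might be done, not a description of our actual costing method; `xpath_delta` and the second path list are hypothetical.

```python
def xpath_delta(old_paths, new_paths):
    """Compare two collections of unique XPaths and summarise what changed."""
    old_set, new_set = set(old_paths), set(new_paths)
    return {
        "added": sorted(new_set - old_set),      # paths that need new handling
        "removed": sorted(old_set - new_set),    # paths that no longer occur
        "net_change": len(new_set) - len(old_set),
    }


# Hypothetical path lists for two versions of the sample inputs.
before = ["/book", "/book/section", "/book/section/p"]
after = ["/book", "/book/section", "/book/section/p", "/book/section/section/table"]
print(xpath_delta(before, after))
```

The "added" list is the part most likely to matter for an estimate, since each new path is a context the XSLT does not yet handle.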

For example, today I saw an interesting use from a colleague. He writes XSLTs to generate InDesign XML documents from data from a collaborative CMS (PageSeeder).

When he first gets his sample input documents, he runs an XSLT to generate a spreadsheet with all the unique XPaths and their counts. When he writes his XSLT code, he also generates an instrumented version of the code which generates elements that give the XPath of the element in the original document. (No index predicates are necessary here.)

Now running the input documents through this augmented version, and then running a report transform on the augmented output documents, he obtains a list and count of all unique XPaths that were consumed in processing the document. This gets fed back into the spreadsheet, and, Bob's your uncle, he has a nice list showing the current coverage by the XSLT program of his input documents.
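The report step can be sketched in Python as follows. This is not my colleague's report transform, which is written in XSLT. It assumes the instrumented output marks each consumed element with a hypothetical `<trace path="..."/>` element; the element name, the attribute name, and the CSV column headings are all inventions for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter


def consumed_xpaths(augmented_output, marker="trace", attribute="path"):
    """Count the source XPaths recorded by marker elements in the augmented output."""
    root = ET.fromstring(augmented_output)
    return Counter(
        el.get(attribute) for el in root.iter(marker) if el.get(attribute) is not None
    )


def coverage_csv(input_counts, consumed_counts):
    """Merge input counts and consumed counts into CSV text, one row per input path."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["xpath", "in_input", "consumed"])
    for path in sorted(input_counts):
        writer.writerow([path, input_counts[path], consumed_counts.get(path, 0)])
    return buffer.getvalue()


# Hypothetical data: two paths in the input, only one of them consumed.
input_counts = {"/book/section/p": 2, "/book/section/table": 1}
augmented = (
    '<Root><trace path="/book/section/p"/><Para/>'
    '<trace path="/book/section/p"/><Para/></Root>'
)
print(coverage_csv(input_counts, consumed_xpaths(augmented)))
```

Paths that never appear in a marker element get a consumed count of 0, which is what makes the unhandled contexts stand out in the spreadsheet.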

For example, the spreadsheet might say /book/section/p/emph 18 18, where the first number is the count in the input and the second is the count consumed. This indicates that every occurrence of the emph element has been handled, in that context. But /book/section/section/table 15 0 indicates that tables have not been handled in that context at all. And, perhaps most interestingly, /book/section/section/heading 12 10 would indicate that not all inputs are being tested.
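The reading of those three rows can be written down as a small Python sketch. The status labels and the count-weighted completion figure are assumptions for illustration, not something the spreadsheet itself defines.

```python
def row_status(in_input, consumed):
    """Classify one spreadsheet row by comparing input count and consumed count."""
    if consumed == 0:
        return "not handled"
    if consumed >= in_input:
        return "fully handled"
    return "partially handled"


def completion_rate(rows):
    """Fraction of all element occurrences that were consumed (count-weighted)."""
    total = sum(in_input for _, in_input, _ in rows)
    handled = sum(min(consumed, in_input) for _, in_input, consumed in rows)
    return handled / total if total else 0.0


# The three example rows from the text: (xpath, count in input, count consumed).
rows = [
    ("/book/section/p/emph", 18, 18),
    ("/book/section/section/table", 15, 0),
    ("/book/section/section/heading", 12, 10),
]
for path, in_input, consumed in rows:
    print(f"{path}: {row_status(in_input, consumed)}")
print(f"completion: {completion_rate(rows):.0%}")
```

A count-weighted rate is only one choice; counting fully handled paths instead of occurrences would give a different, and arguably stricter, figure.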

Now, isn't this another kind of unit testing? Well, perhaps, but it is very effective because spreadsheets have one great quality that unit-test listings don't have: they are very manager-friendly. Quasi-technical managers (what is a better term? it is not non-technical) can get the idea of an XPath easily, and the idea of a count. And the counts help estimate completion rates and so on.

So let's hear it for the humble XPath! Huzzah!

Your Xpaths article helped me make what I hope is a useful decision. My virtual xml editor, currently in beta, indexes the xml and includes a unique paths list at the end of the index file. In addition to Xpath query, I have a text search to which I was thinking of adding an option to specify the path that the text must occur in. A combobox containing the unique paths will be provided, and now I think I will add the ability to copy the list to the clipboard as that may be helpful to some users.

Thanks for the article.

What is the XSLT that he uses to generate the xpath spreadsheet?

I'd be very interested in the xpath spreadsheet generator as well, especially if it uses MS OOXML.
