Is It Time for an EXQuery.org?

By Kurt Cagle
January 7, 2009 | Comments: 8

Eight years ago, after the XSLT 1.0 specification had been around for a couple of years, those of us who had been working fairly heavily with the early specification came to the collective realization that something was missing. XSLT1 had a number of holes, including the inability to create intermediate trees, very limited math functions, no way to parse strings (or create strings from sequences of nodes) and so forth. Additionally, while the rather ambiguously worded statements about extension functions in the XSLT specification indicated that extension functions were possible, it also made it very clear that this was a vendor specific issue.

What this meant in practice was that as vendors and project developers put out differing XSLT implementations, the support that they offered for extensions varied wildly, from none at all to pre-defined function sets to an open-ended API for extending the language, depending upon the vendor. What's worse, even in cases where two vendors did establish extension libraries, their implementation and function signatures varied dramatically from one another, making portability of XSLT scripts a real problem.

In the spirit of the original standard, one solution that made a fair amount of sense was to establish a library of routines outside of immediate vendor control, which Jeni Tennison, Jim Fuller, Dave Pawson, Uche Ogbuji and other well known XSLT hackers did by establishing the EXSLT.org project in 2001, the name

The goal for EXSLT was simple - create namespaces for library modules in several different areas - core XSLT (for functions like node-set(), which let users convert result tree fragments into node-sets), math functions, regular expression methods, string functions, dates and times, set manipulation, and related areas. Where possible, XSLT-based solutions were also suggested in those cases where an extensibility framework couldn't be fully implemented, though these were generally last resort routines.

The real power of EXSLT came as a way of establishing a standard, platform independent set of functions that could be implemented by XSLT processor vendors or third parties with those tools, and while it took a while, nearly all XSLT implementations out today support at least a subset of EXSLT.

In a recent article for IBM DeveloperWorks, Jim Fuller (part of the original EXSLT team) raised the question of whether XQuery is in fact reaching the same stage of needing a consistent vendor neutral extension library:

Most XQuery implementations have added their own third-party functions, providing all manner of additional capabilities. The obvious issue with using such extension functions is that you make your XQuery code potentially incompatible if you depend on a specific, non-standard functionality exposed by a specific implementation.

I fought this fight against vendor lock-in before in assisting with the EXSLT effort. I am not surprised that it comes up again in XQuery.

I'd come to much the same conclusion as I reviewed a number of XML Databases recently ... while most have made the jump to the final release of XQuery 1.0, there was very little consistency with methods that fell outside of these core functions, even when there was obviously similar thinking with regard to what was needed.

As one obvious instance, the random() function is quite useful in XQuery for generating simulated content or for statistical sampling, both of which occur quite frequently in applications I've worked with, yet a random number generator was deemed outside the scope of XPath 2. Unfortunately, different systems implement a randomizer function in different namespaces, often with different function signatures, as Table 1 illustrates:

Vendor      Random() Function(s)
Saxon       math:random(number), random:random-sequence(number,number)
MarkLogic   xdmp:random([$max as xs:unsignedLong]) as xs:unsignedLong
eXist       math:random(xs:double) as xs:double, util:random(xs:integer) as xs:integer, util:random() as xs:double

Other databases, such as IBM DB2 pureXML, don't define a random() function at all, but rather provide a set of Java or C# hooks to implement one from the appropriate foundation classes.

In many cases, the EXSLT functions (which are in reality XPath functions) can be used quite effectively within an XQuery context as they stand. Moreover, when XPath 2.0 -- which forms the XPath basis for XQuery -- was developed, the EXSLT model was carried over where applicable, so there's a lot of overlap between EXSLT and XPath 2.

The biggest holes are in more systemic areas - parsing and serialization control, document validation, higher order evaluation and invocation, more sophisticated ordering capabilities, tag crossing text search capabilities, math functions, web services invocation, server environment functions when in that context, document enrichment, even geospatial functions.

These are all common operations on XML databases in particular, which I suspect will represent the bulk of all XML search and manipulation in the near future. Other areas are more problematic, but just as useful - IMAP or POP3 mail production, SQL integration languages, XSLT integration as well as common user and group validation and related database functionality.

For the most part, new EXQuery functions would simply represent wrappers around existing XQuery extension functionality in order to provide a consistent interface between databases. It would also set a bar that determines the minimal expectation of such databases and data systems and provide a way for new entrants into the field to be able to run XQuery scripts without having to refactor code.
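To make the wrapper idea concrete, here is a minimal, language-neutral sketch written in Python rather than XQuery. Everything in it is an assumption made for illustration: the two "vendor" functions are hypothetical stand-ins that only mimic two of the signature shapes in Table 1 (a zero-argument call returning a double, and a call taking an upper bound and returning an integer), the value ranges are assumed, and the names random_below, ADAPTERS and REQUIRED_FUNCTIONS are invented here. It is not how any actual EXQuery module is or would be defined.

```python
import random
from typing import Callable, Dict, Set

# Hypothetical stand-ins for vendor extension functions. They mimic two
# signature shapes only; they are NOT the vendors' real implementations,
# and the value ranges below are assumptions.
def vendor_a_random() -> float:
    """Zero-argument style: returns a float in [0.0, 1.0)."""
    return random.random()

def vendor_b_random(max_value: int) -> int:
    """Bounded style: returns an integer in [0, max_value)."""
    return random.randrange(max_value)

# Each adapter maps a vendor's native call onto one shared signature:
# random_below(n) -> integer in [0, n).
ADAPTERS: Dict[str, Callable[[int], int]] = {
    "vendor-a": lambda n: int(vendor_a_random() * n),
    "vendor-b": lambda n: vendor_b_random(n),
}

def random_below(n: int, backend: str) -> int:
    """Vendor-neutral entry point; callers never touch vendor functions."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    try:
        adapter = ADAPTERS[backend]
    except KeyError:
        raise LookupError(f"no adapter registered for backend {backend!r}")
    return adapter(n)

# Names a backend must wrap to meet the shared baseline (hypothetical list).
REQUIRED_FUNCTIONS: Set[str] = {"random_below"}

def missing_functions(provided: Set[str]) -> Set[str]:
    """Return the baseline functions a backend has not yet wrapped."""
    return REQUIRED_FUNCTIONS - provided
```

A script that calls random_below(10, "vendor-a") runs unchanged if the backend string is switched to "vendor-b", which is the portability a common library would be after. The missing_functions check is one simple way to express the "minimal expectation" bar: a new entrant wraps whatever native functions it has until the returned set is empty.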

In an era of data abstraction, web services and distributed data, EXQuery just makes sense. Please let me know what you think about getting involved in an effort like this.

Kurt Cagle is Online Editor for O'Reilly Media. He can be followed via his Atom feed or on Twitter.


8 Comments

We've talked about doing something like this within the XML Query Working Group, and also with the XSL Working Group since the Functions and Operators spec is joint with XQuery, XPath and XSLT.

I've also considered making somewhere on www.w3.org for community projects, and hope to make more noise about such things in 2009. For now, of course, people are welcome to use the W3C www-ql public list to discuss things they'd like to see standardised across implementations, as well as to make public comments on the relevant specs using bugzilla, as always.

Some of the things you mention are also under consideration for XQuery 1.1, and of course we also have the Full-Text extensions for searching.

Liam

What about simply just using http://xQuery.org ?

Hi Kurt,

A few weeks ago, the need for a new EXSLT set of extensions, for XSLT 2.0, showed up again on XSL List (at Mulberry Tech.) The result was a few discussions about possible extensions and I've set a wiki up to keep track of them, available at http://www.fgeorges.org/exslt2-wiki/.

I do think that most of the extensions needed by XSLT 2.0 are XPath 2.0 extensions, and would benefit XQuery as well. Actually I am reading the extension libraries of some XQuery implementations to take them into account to define some EXSLT2 extensions.

Most of the interesting areas you mention in your post have been discussed or at least mentioned for EXSLT2. I think only server environment functions weren't discussed, but even those would be interesting for XSLT 2.0 I think.

I think both communities would benefit from each other if the effort was common. Exactly as developing XPath 2.0 commonly by both XSL and XML Query WGs was a great idea.

For more info about EXSLT2, please see the original EXSLT mailing list at http://lists.fourthought.com/mailman/listinfo/exslt and the wiki at: http://www.fgeorges.org/exslt2-wiki/.

What do you think?

Kind regards,

--
Florent Georges
http://www.fgeorges.org/

I've been asking for something like this from inside the XQuery WG for a while now, as well as being involved in discussions about EXSLT 2 recently. Count me in - I think we're getting a critical mass here.

Kurt,

I agree with everything you state in this post. Jim Fuller also has some very good ideas about the creation of portability wrapper modules.

I have been trying to use the eXist native XML database as a web application server and I would like to be able to exchange XQuery-based applications not only with my other friends using eXist but also with other native XML databases like MarkLogic and DB2 PureXML.

What I would like is not just a set of standards for getting URL parameters and random number functions but a plugin architecture for making XQuery applications more portable, so that features like site-wide navigation menus, federated keyword search and site-wide help all just work when you drop an XQuery application into a collection in the "apps" collection.

Much of this application portability can only be achieved by standardizing on the information an application publishes to the execution web site. In this light I am working on a specification called the XQuery Application Information (AppInfo) that could be part of an overall XRX framework. Basically it relies on a single XML configuration file for all applications that describes the resources of the application. This file could then be used by any database vendor that wants to import the application for its customers. This is similar to the plugin architecture we see in Eclipse.

Please let me know if others are interested in this topic.

John,

I am very glad that an XQuery implementer is interested!
Any help is welcome. And I'd like to have soon a first
useful extension that would be implemented in both XSLT and
XQuery (and plain XPath.) An interesting point with several
XQuery implementations is that they have already a set of
useful extensions, so some EXSLT functions should be able to
be written themselves in (implementation-dependent) XQuery.

Regards,

--
Florent Georges
http://www.fgeorges.org/

Dan,

This is a very interesting point. And I think such a
framework could benefit from being written itself in XQuery,
so I guess EXSLT 2.0 could be of help here. And on the
other hand, I think we could benefit too from such a project
simply because it would point out which features would be
pertinent in that context, and what could be built on top of
them.

I would really like to have a set of web-application
oriented functions to be able to define the context of
evaluation of an XSLT stylesheet on a web server, and
clearly, the use cases and experience in that field should
be got from XQuery.

Do you have any link about AppInfo?

Regards,

--
Florent Georges
http://www.fgeorges.org/

thx Kurt for the reference.

As for Liam's comment of incubating EXSLT type efforts from within the W3C or any other standards organization as part of a larger community wide consultation; it's a good idea but I would argue difficult to achieve.

XML is lucky in that it has an active community and existing 'places' on the net for developers to go and cogitate.

I have a few ideas on how to go about building such communities ... I think to achieve some sort of gravitas/authority they need to be independent of companies, standards bodies, etc and represent individual developer concerns.

I would imagine that the W3C may gain some traction in supporting and growing such communities via grants & sponsorship routes instead of direct and internal efforts (which may eventually clash with W3C member aims).

There is an emerging model that I have been following, that seems to be working with the Perl6 development (aside from the very long timeline ;) ) in that there is a community that raises funds and chooses active and productive individuals for grants to forward the aims of the language.

This might work for W3C standards but is potentially problematic if the W3C is seen to be financing efforts such as the development of reference implementations of their standards (members may get upset at this!).

I guess what I am saying is that standards development is too top heavy (and exclusive) and no matter how big the W3C becomes it won't be big enough to represent the community of developers at large.

Perhaps we need to take a page from Obama's net campaign and learn how to grow a network of active individuals ... by its nature it needs to devolve control and funds to these individuals to grow at the grass roots level what they think is required.

One last thing to note with these kinds of things is luck and timing ... EXSLT was facilitated (in 2000-2001) by the internet bubble bursting and lots of 'hands' at the ready to assist. There are more now ;)

cheers, Jim
