How fuzzy should a date be?

By Rick Jelliffe
November 22, 2009

From Bruce D'Arcus' Darcusblog comes a pointer to a U.S. Library of Congress initiative for a better date format, the Extended Date Time Format (EDTF).

ISO 8601's problem is that almost anything is a date: if my memory serves me, some date values are ambiguous so you need to make a subset or add some attribute to say which kind of date you mean.

W3C XML Schemas 1.0 datatypes do use a subset of ISO 8601, in fact multiple subsets: xs:date, xs:dateTime, xs:gDay and so on. They are not derived from one another because their value spaces are different. Predictably, I think this shows a significant problem in the XSD 1.0 type system; the Emperor seems impervious to draft, however. (I am beginning to suspect the Emperor is a flasher, actually.) XSD 1.1 improves dateTime, by the way, to include timezone offsets from UTC, which I expect would meet with general approval as a good improvement.
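To make the "multiple subsets" point concrete, here is a rough Python sketch that reports which XSD date/time types a string lexically fits. The regular expressions are my own simplification rather than the normative patterns from the spec (they do not range-check months, days or hours, for instance), and the names `XSD_LEXICAL` and `matching_types` are invented for illustration.

```python
import re

# Optional timezone suffix: "Z" or a +/-hh:mm offset.
_TZ = r"(Z|[+-]\d{2}:\d{2})?"

# Simplified lexical shapes only; values such as month 13 are NOT rejected here.
XSD_LEXICAL = {
    "xs:dateTime": re.compile(r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ),
    "xs:date": re.compile(r"-?\d{4,}-\d{2}-\d{2}" + _TZ),
    "xs:gYearMonth": re.compile(r"-?\d{4,}-\d{2}" + _TZ),
    "xs:gYear": re.compile(r"-?\d{4,}" + _TZ),
    "xs:gMonthDay": re.compile(r"--\d{2}-\d{2}" + _TZ),
    "xs:gDay": re.compile(r"---\d{2}" + _TZ),
}


def matching_types(text):
    """Return the names of the XSD types whose simplified pattern fits text."""
    return [name for name, pattern in XSD_LEXICAL.items() if pattern.fullmatch(text)]


for sample in ["2004-06-05", "2004-06", "---22", "2009-11-22T00:00:00Z"]:
    print(sample, matching_types(sample))
```

Each sample lands in exactly one type, which is the sense in which these are separate subsets of ISO 8601 rather than restrictions of one common type.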

The XSD 1.1 CR specification has a very useful section discussing date/time issues: it quite nicely analyzes things in terms of a seven-property model. However, the analytical model is not reflected in the declarative capabilities of XSD datatypes, and I think this is something that could be looked at.

It is all part of the issue of compound datatypes, which is what ISO DSDL Part 5 Datatype Library Language (DTLL) (warning: draft only) is supposed to address. The flaw with XSD Datatypes is not so much the type system (facets and type derivation by restriction seem innocuous) but the type generation system ("derivation" by list? come on...!).

What the EDTF people seem to want is something more than even DTLL may offer: some measure of fuzziness or wildcarding, for example:


Some year between 2000 and 2099: 20??
Year and month, questionable: 2004-06?
Year and month, approximate: 2004-06~

This seems to follow the same trend as XBRL, in a way: the idea that whenever you have a count of something, you need to know its units and precision. One way to treat these values would be as wildcards; another would be as lexical shortcuts for setting some facets (I presume they are not properties); a third would be to say that they are actually ranges.
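As a rough illustration of the "ranges" reading, the Python sketch below turns each of the three notations quoted above into an earliest/latest pair plus a qualifier. It is only my interpretation of those three examples, not an implementation of the EDTF draft; `FuzzyDate` and `parse_fuzzy` are names made up for this sketch.

```python
import calendar
import re
from datetime import date
from typing import NamedTuple


class FuzzyDate(NamedTuple):
    earliest: date
    latest: date
    qualifier: str  # "none", "questionable" or "approximate"


def parse_fuzzy(text: str) -> FuzzyDate:
    """Read one of the three example notations as a closed date range."""
    # Year-month with an optional trailing "?" or "~": the range is the
    # whole month, and the marker is kept as a separate qualifier.
    m = re.fullmatch(r"(\d{4})-(\d{2})([?~]?)", text)
    if m:
        year, month = int(m.group(1)), int(m.group(2))
        last_day = calendar.monthrange(year, month)[1]
        qualifier = {"": "none", "?": "questionable", "~": "approximate"}[m.group(3)]
        return FuzzyDate(date(year, month, 1), date(year, month, last_day), qualifier)

    # "?" standing in for trailing year digits is treated as a wildcard,
    # so 20?? spans 2000-01-01 through 2099-12-31.
    m = re.fullmatch(r"(\d{1,3})(\?{1,3})", text)
    if m and len(text) == 4:
        digits, marks = m.group(1), m.group(2)
        low = int(digits + "0" * len(marks))
        high = int(digits + "9" * len(marks))
        return FuzzyDate(date(low, 1, 1), date(high, 12, 31), "none")

    raise ValueError(f"unsupported notation: {text!r}")


for sample in ["20??", "2004-06?", "2004-06~"]:
    print(sample, parse_fuzzy(sample))
```

Keeping the qualifier apart from the range is a design choice of this sketch: under the wildcard reading the `?` in `20??` widens the range, while under the facet reading the `?` in `2004-06?` leaves the range alone and only annotates it.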



1 Comment

DateTime is tricky because it is where markup meets a very human-centric issue, the notion of "when".


I think xsd:dateTime has always had UTC offsets, because ISO 8601 has them. What it doesn't have is the ability to say "00:00, wherever you are": you have to specify a timezone, so you have to bind the time to a specific place on the planet's surface, rather than to midnight round the planet.
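A small Python sketch of the distinction, using the standard `datetime` module rather than any XSD tooling. The two fixed offsets are arbitrary values picked for illustration, and the variable names are invented here.

```python
from datetime import datetime, timedelta, timezone

# Two fixed UTC offsets, chosen purely for illustration.
east = timezone(timedelta(hours=11))
west = timezone(timedelta(hours=-5))

# The same wall-clock reading, bound to two different offsets.
midnight_east = datetime(2009, 11, 22, 0, 0, tzinfo=east)
midnight_west = datetime(2009, 11, 22, 0, 0, tzinfo=west)

# Once an offset is attached, the two midnights are different instants.
gap = midnight_west - midnight_east
print(gap)

# A naive value carries no offset at all. It is the nearest Python analogue
# of "00:00, wherever you are", and it cannot be ordered against the
# offset-bound values above.
floating_midnight = datetime(2009, 11, 22, 0, 0)
try:
    floating_midnight < midnight_east
except TypeError as error:
    print("cannot compare:", error)
```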
