Microsoft has lost its appeal on the Custom XML feature in Word 2007!
To prevent confusion, the removal only applies to one feature in Word 200x that no-one would be using casually on homemade or general office documents.
[UPDATE: MS' Gray Knowlton has a blog in which he says that a completely different feature is affected: not the feature related to binding custom controls to an embedded XML file, but the feature that allows you to add the pink tags using a schema. Gray is certainly the person to know what is what here, but this makes absolutely no sense to me: where is the separation that is talked about in the patent? I'll post more as we get info.]
I wrote previous blog entries on how the early press reports on various XML patents were sensational, on what the idea of the i4i patent seems to be, on how the idea of external markup predates the patent (for example, Ted Nelson wrote about it), and on how the idea seems obvious from a prior ISO standard, HyTime (which is why I think the patent is junk).
Note: the link given in those blogs to the patent now points to some other patent. Here is a link that works at the moment.
However, Microsoft's defense did not raise those kinds of issues. And, in fact, the appeal court disallowed most kinds of appeals about obviousness and prior art, because Microsoft's lawyers had not filed a pre-verdict claim for judgment as a matter of law (i.e. where the decision is made before it gets to the jury to decide the facts, because the judge can tell there is not a legally sufficient evidentiary basis). So the appeals court could not look into those issues fresh: it could only look at whether the right procedures seemed to have been followed. You can read the decision here.
So none of this challenges my view that the patent is junk: the idea of out-of-line markup precedes it (e.g. Ted Nelson), the idea of documents made by pointing into datasets precedes it (e.g. the HyTime standard), the patent misrepresents the status quo at the time by not referring to HyTime and dataloc in its discussion of SGML, it is unclear about whether the data store has markup, and so on. If there is a significant difference, it is not clear to me. The USPTO should not take this decision to alter any of these issues: they were not considered.
Tim's argument is that people shouldn't be making their own private vocabularies but using standard ones. It is a pretty unconvincing argument: first because the CustomXML feature can be used with standard schemas (e.g. I have done it using XBRL), and second because sometimes private schemas are useful. (Michael Kay, who has objections to software patents in any case, has a good comment on this on Tim's blog.)
Peter's argument is that using features that are only in one application locks you in to that application. (I am not sure CustomXML is entirely unique, by the way: I wonder whether ODF will have to look at whether this affects XForms as well.) But to say you should never buy or use any product because it alone has some particular feature that you need would be perverse in the extreme! While bogus as a general rule, it does make sense for the kinds of academic documents and use cases Peter is mainly concerned with, however.
(I do have another set of partial objections to Tim's and Peter's views too: they are wrinkles on the idea that competition on features or standards reduces short-term optimality but increases longer-term optimality. It is the old 'command economy' argument, I guess: why would competitors add a feature if users refuse to use that feature in the pioneer because it is pioneering? And why would standards groups add that feature to an 'aspirational' standard if only one vendor supports it? All the attempts to grasp for interoperability by blanket bans on extensibility have a downside: they promote stagnation. Smarter approaches (such as CustomXML and MCE) are clearly better, but this decision really muddies the waters.)
Strictly, the i4i patent does not relate to XML in particular, but to a way of using (or implementing) a particular feature: indeed, it relates to structures whether they are in a file or not. I guess you could see the patent as closer to the Infoset view of things. But we can limit it further. First, it is about documents which have structure and content. Next, the appeal decision says that the jury found that the patent does not apply to systems that merely use trees and pointers: a tree is apparently not a "metacode map" and a pointer is not an "address in use".
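To make the general idea of out-of-line markup concrete, here is a minimal sketch in Python. It is only an illustration of keeping content and structure in separate stores: the names (content, markup_map, apply_markup) and the offset-based layout are assumptions made up for this example, and nothing here is meant to say what the patent's "metacode map" legally covers or how Word implements anything.

```python
# Hypothetical illustration: the content is stored with no markup at all.
content = "Dear Jane, your balance is 42 dollars."

# The structure lives elsewhere, as (tag name, start offset, end offset)
# entries that point into the content. This layout is an assumption.
markup_map = [
    ("name", 5, 9),
    ("amount", 27, 29),
]


def apply_markup(text, entries):
    """Weave the external entries back into the text as inline tags.

    Assumes the entries do not overlap or nest.
    """
    pieces = []
    position = 0
    for tag, start, end in sorted(entries, key=lambda entry: entry[1]):
        pieces.append(text[position:start])                 # untagged text
        pieces.append(f"<{tag}>{text[start:end]}</{tag}>")  # tagged span
        position = end
    pieces.append(text[position:])                          # trailing text
    return "".join(pieces)


if __name__ == "__main__":
    print(apply_markup(content, markup_map))
```

The point of the sketch is that either store can be changed without touching the other: the same content could be paired with a different map, which is the kind of separation that HyTime-style location addressing already allowed.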
It will be interesting to see if this represents an inflection point. Microsoft has to remove CustomXML support, so I expect that IS 29500 would have it removed. I expect that ODF will have to have a good look at XForms support, as will W3C. I don't know how on earth they have enough information to come to any good conclusion, either from the patent or the judgment: to be sure, they may be best off getting a free license from i4i (if i4i would give it) or some formalized agreement that the i4i patent does not apply. I am very interested in finding out from any XForms person why this decision doesn't also apply to XForms, which would spell its end. And what about RDF: is it external markup too? And does it apply to some AJAX systems? All this is another example of the stifling effect of software patents: a scourge on the industry and a market distortion.
CustomXML and XForms have been important ingredients in the quest to separate data from markup, and Microsoft had been building their systems on them. This is in part to ramp up Word as a general user interface able to compete with Web browsers for convenience. It will be interesting to see what they do now: presumably just directly embedded data (confusing the presentation and data) and some kinds of transforms to import and export. Maybe the XML team will be on the nose at Microsoft now: more likely, it will increase the drive to make Word a proper structured editor based on the existing embedded XML feature (not the CustomXML), but allowing it to drive styles and so on.
There is a question closer to home too.
Suppose I have a Schematron schema which has a variable containing a chunk of arbitrary XML marking up constants to be used in the schema. Currently the way to do this is to use the document function, but we have been talking about allowing XSLT-style variables with the value taken as the content. Now if I have variables that locate various fields of interest in that data, are these variables a "metacode map"? (The same issue applies to XSLT as well, I expect.)
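The pattern in question is very ordinary. Here is a rough Python analogue of it (not Schematron or XSLT itself): a chunk of XML holds the constants, and a handful of named "variables" locate the fields of interest inside it. Everything in it is hypothetical and invented for illustration, including the element names, the locators table and the lookup and in_range helpers.

```python
import xml.etree.ElementTree as ET

# Hypothetical chunk of XML marking up constants used by the checks.
CONSTANTS_XML = """
<constants>
  <limits>
    <min>1</min>
    <max>100</max>
  </limits>
  <currency code="AUD"/>
</constants>
"""

constants = ET.fromstring(CONSTANTS_XML)

# Named "variables" that locate fields in the constants data.
# The paths use the limited XPath subset that ElementTree supports.
locators = {
    "min": "limits/min",
    "max": "limits/max",
    "currency": "currency",
}


def lookup(name):
    """Return the element that the named locator points at."""
    node = constants.find(locators[name])
    if node is None:
        raise KeyError(f"locator {name!r} matched nothing")
    return node


def in_range(value):
    """A sample rule: the value must lie between the stored limits."""
    low = int(lookup("min").text)
    high = int(lookup("max").text)
    return low <= value <= high


if __name__ == "__main__":
    print(in_range(50), in_range(101), lookup("currency").get("code"))
```

Seen this way, the locators table is nothing more than constants pulled out of the rules and addressed by name, which is why the question of whether it counts as a "metacode map" matters.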
Even if patents are allowed on software, it should be impossible to patent what boil down to fundamental software engineering techniques such as Information Hiding (e.g. removing constants to headers), no matter how they are dressed up.
If I were king of the world, I would just ditch the entire US (and international) patent system. It does not seem to produce a clear net benefit, compared to the market distortion. The pattern of development might change, but development would still go ahead. Microsoft can afford the fines. But we, the public, should not have to fork out to support monopoly profits: whether the monopoly is from market domination or artificially created by patents and supposed IPR.