Microsoft has been in the news in the last month in relation to two patents, one it received and one it has been ordered to pay $200 million in damages for infringing.
I've been looking through both, and the patents seem to bear little resemblance to the reports about them.
On the face of it, the first patent, the one Microsoft received, would appear to cover all usage of XML and XSDs in word processing documents, which would effectively leave all other modern word processors - and other software that used their documents - liable to licensing by the company.
Waaaah? ODF too? .....really?
But when you read the patent, you can see it actually covers a much smaller class of documents. As I read it (IANAL):
- The whole thing must be a single XML file. Not SGML. Not HTML. Not a bunch of XML files in a ZIP archive.
- AND It has to use XSD. Not RELAX NG, not Schematron, not DTDs.
- AND It can only use element content. Not mixed content.
- If there are bookmarks, they use separate elements for start and end. Not a single element.
It then lards on various additional features, for example to say that style information can include font information. It seems like a defensive patent.
So this would seem to me to be a patent on the structures found in the now-obsolete Office 2003 XML formats. That such a thing can be patented is a really retrograde sign, but it does not seem to correspond even remotely to the explanation given in the ZDNET reports.
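To make two of the conditions in the list above concrete, here is a minimal Python sketch of how one might test a document for mixed content and for paired bookmark elements. The element names (p, r, t, bookmarkStart, bookmarkEnd) are made up for illustration and are not taken from the patent or from any Office schema, and "mixed content" is taken here to mean an element that has both child elements and non-whitespace text of its own.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(root):
    """True if any element has child elements AND non-whitespace text of its own."""
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue  # leaf elements holding only text are not counted as mixed here
        texts = [elem.text] + [child.tail for child in children]
        if any(t and t.strip() for t in texts):
            return True
    return False


def bookmarks_are_paired(root):
    """True if every bookmark is a separate start element matched by an end element."""
    starts = {e.get("id") for e in root.iter("bookmarkStart")}
    ends = {e.get("id") for e in root.iter("bookmarkEnd")}
    return bool(starts) and starts == ends


# Hypothetical documents: the first keeps all text inside leaf elements,
# the second interleaves text and markup in the HTML style.
ELEMENT_ONLY = (
    "<p><r><t>Hello </t></r><bookmarkStart id='1'/>"
    "<r><t>world</t></r><bookmarkEnd id='1'/></p>"
)
MIXED = "<p>Hello <b>world</b></p>"

element_only_doc = ET.fromstring(ELEMENT_ONLY)
mixed_doc = ET.fromstring(MIXED)
```

On this reading, only a document shaped like the first one would be in the class the patent describes; the second, HTML-style one would fall outside it.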
The second patent is more sensational. It concerns a 1998 patent, 5,787,449. So what does ZDNET say about this one?
Microsoft is barred from selling any Microsoft Word products that can open XML files (.xml, .docx and .docm), according to a U.S. District Court ruling
But when we look at the actual injunction we see that the judge bans MS from (emphasis added):
selling, offering to sell, and/or importing in or into the United States any Infringing and Future Word Products that have the capability of opening a .XML, .DOCX, or .DOCM file ("an XML file") containing custom XML;
though the injunction allows such files to be opened in ways that strip out the custom XML.
I don't know enough of the details of the case to say much about how Word is supposed to have infringed the patent. But custom XML is where some arbitrary non-OOXML XML document is included as a part of the OOXML (ZIP) file, and the main OOXML document uses XPath locators to find various elements or attributes in that XML document and to use their values. So when you edit the data, the data in the custom XML file is edited as well as the cached version in the OOXML file.
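As a rough illustration of that binding idea (and nothing more), here is a small Python sketch. The BoundField class, the record/patient element names and the XPath are all hypothetical; real OOXML expresses the binding in its own markup, which is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom XML part carried alongside the main document.
custom_part = ET.fromstring(
    "<record><patient><name>Ann</name><ward>7</ward></patient></record>"
)


class BoundField:
    """Stand-in for a field in the main document bound to the custom XML part."""

    def __init__(self, xpath, part):
        self.xpath = xpath
        self.part = part
        # The main document keeps its own cached copy of the bound value.
        self.cached = part.find(xpath).text

    def edit(self, new_value):
        # An edit updates the custom XML part and the cached copy together.
        self.part.find(self.xpath).text = new_value
        self.cached = new_value


field = BoundField("./patient/name", custom_part)
field.edit("Bob")
```

After the edit, both the cached value in the field and the name element in the custom part hold the new value, while the rest of the custom part is untouched.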
This feature allows Word to be used as a general purpose XML editor, for documents which are susceptible to being edited as forms. I don't think that OOXML's system is that much different from using XForms, so the ODF people will be looking at this too. It is part of Microsoft's medium-term drive to make Word a viable interface for forms data, such as hospital records, as the front-end for various back-end products that exchange data with it. Data-as-documents.
The patent seems to be about out-of-line markup and indirect addressing of a string table. You read in a marked-up document and split it into 1) a list of content strings and 2) a map giving the codes for those strings and information to allow addressing of the strings. You can then have multiple maps and various other practical advantages. (Actually, I find it hard to see why a hash table wouldn't offend this patent, but I suppose things don't work like that. And SGML's data attribute feature can be used for this kind of thing, can't it?)
[UPDATE: For a fuller description of the patented technology, see Part 2 on this blog.]
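For readers who have not met out-of-line markup before, here is a loose Python sketch of the general idea of separating content from a map of markup codes. It is my own illustration under simple assumptions (character offsets into one joined content string, tags recorded as literal strings); it does not follow the patent's claims, and the function name split_markup is invented.

```python
import xml.etree.ElementTree as ET


def split_markup(xml_text):
    """Split a document into raw content and a separate map of markup codes."""
    root = ET.fromstring(xml_text)
    chunks = []      # content strings, in document order
    markup_map = []  # (offset into the joined content, markup code)
    offset = 0

    def add_text(text):
        nonlocal offset
        if text:
            chunks.append(text)
            offset += len(text)

    def walk(elem):
        # Record where the element opens, emit its text, recurse, record the close.
        markup_map.append((offset, "<%s>" % elem.tag))
        add_text(elem.text)
        for child in elem:
            walk(child)
            add_text(child.tail)
        markup_map.append((offset, "</%s>" % elem.tag))

    walk(root)
    return "".join(chunks), markup_map


content, markup_map = split_markup("<p>Hello <b>world</b>!</p>")
# content    -> "Hello world!"
# markup_map -> [(0, "<p>"), (6, "<b>"), (11, "</b>"), (12, "</p>")]
```

Because the content no longer contains any tags, a second map with different codes could point into the same content, which is the "multiple maps" advantage mentioned above.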
So it is unclear to me why it is OOXML's custom XML in particular that infringes this patent: it looks like an implementation technique. The world is so full of mystery. I'll have to troll around and try to find more details of the suit, I suppose. Reader hints welcome as always.
This will undoubtedly have some minor flow-on effect in IS 29500 (which may be just to remove the bits about customXML until the patent expires in 2015). (The mid-90s period in which the patent was granted represents of course the nadir of competency at the USPTO, with those of us in the rest of the world, particularly in standards bodies, scandalized at the kinds of things being allowed.)
But it looks like I will have to add ZDNET to the list of unreliable sources (just up from ComputerWorldNZ and Slashdot.) The US ComputerWorld has a more realistic piece on it: Injunction on Microsoft Word unlikely to halt sales.