Microsoft and the two XML patents

By Rick Jelliffe
August 12, 2009 | Comments: 12

Microsoft has been in the news in the last month in relation to two patents, one it received and one it has been ordered to pay $200 million in damages for infringing.

I've been looking through both, and the patents seem to bear little resemblance to the reports about them.

First is their new 2009 patent 7,571,169. ZDNET reports on this:

On the face of it, the patent would appear to cover all usage of XML and XSDs in word processing document, which would effectively leave all other modern word processors - and other software that used their documents - liable to licensing by the company.

Waaaah? ODF too? .....really?

But when you read the patent, you can see it actually covers a much smaller class of documents. As I read it (IANAL):


  • The whole thing must be a single XML file. Not SGML. Not HTML. Not a bunch of XML files in a ZIP archive.

  • AND It has to use XSD. Not RELAX NG, not Schematron, not DTDs.

  • AND It can only use element content. Not mixed content.

  • If there are bookmarks, they use separate elements for start and end. Not a single element.
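To make the "element content, not mixed content" distinction concrete, here is a minimal sketch in Python. The element names (`p`, `r`, `t`, `bookmarkStart`, `bookmarkEnd`) are made up for illustration and are not taken from the patent or from any Office schema; the check is only my reading of the distinction, not a test for infringement.

```python
import xml.etree.ElementTree as ET

def has_mixed_content(root):
    """Return True if any element holds both child elements and non-blank text."""
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue  # a leaf holding only text is ordinary data content
        # Text directly inside elem is its .text plus the .tail of each child.
        texts = [elem.text] + [child.tail for child in children]
        if any(t and t.strip() for t in texts):
            return True
    return False

# Mixed content: character data and child elements interleaved, HTML style.
mixed = "<p>Some <b>bold</b> text</p>"

# Element content only: all text sits in leaf elements, and the bookmark
# is marked by separate start and end elements (hypothetical names).
element_only = (
    "<p><bookmarkStart id='b1'/>"
    "<r><t>Some </t></r><r><t>bold</t></r>"
    "<bookmarkEnd id='b1'/></p>"
)

print(has_mixed_content(ET.fromstring(mixed)))
print(has_mixed_content(ET.fromstring(element_only)))
```

The first document would fall outside the narrow class described above; the second is the kind of structure the list describes.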

It then lards on various additional features, for example to say that style information can include font information. It seems like a defensive patent.

So this would seem to me to be a patent on the structures found in the now obsolete Office 2003 XML formats. That such a thing can be patented is a really retrograde sign, but it does not seem to remotely correspond to the explanation of the ZDNET reports.

The second patent is more sensational. It concerns a 1998 patent 5,787,449. So what does ZDNET say about this one?

Microsoft is barred from selling any Microsoft Word products that can open XML files (.xml, .docx and .docm), according to a U.S. District Court ruling

But when we look at the actual injunction we see that the judge bans MS from (emphasis added):

selling, offering to sell, and/or importing in or into the United States any Infringing and Future Word Products that have the capability of opening a .XML, .DOCX, or .DOCM file ("an XML file") containing custom XML;

though the injunction still allows such files to be opened if the custom XML is stripped out.

I don't know enough of the details of the case to say much about how Word is supposed to have infringed the patent. But custom XML is where there is some arbitrary non-OOXML XML document as a part of the OOXML (ZIP) file, and the main OOXML document then uses an XPath locator to locate various elements or attributes in that XML document and to use their values. So when you edit the data, the data in the custom XML file is edited as well as the cached version in the OOXML file.
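The binding idea can be sketched roughly as follows. This is an illustration of the description above, not of how Word implements it: the `patient` data, the `fields` list and the `edit_field` helper are all hypothetical, and Python's ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom XML part: arbitrary, non-OOXML data stored alongside
# the main document.
custom_part = ET.fromstring(
    "<patient><name>Jane Doe</name><ward>7B</ward></patient>"
)

# Hypothetical stand-in for the main document: each field carries a locator
# into the custom part plus a cached copy of the value for display.
fields = [
    {"xpath": "./name", "cached": "Jane Doe"},
    {"xpath": "./ward", "cached": "7B"},
]

def edit_field(field, new_text):
    """Write an edit through to the custom XML and refresh the cached copy."""
    target = custom_part.find(field["xpath"])
    if target is None:
        raise KeyError(field["xpath"])
    target.text = new_text
    field["cached"] = new_text

# Editing the ward in the "document" updates both copies of the data.
edit_field(fields[1], "9A")
print(ET.tostring(custom_part, encoding="unicode"))
```

After the edit, both the custom part and the cached value hold the new ward, which is the two-way behaviour described above.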

This feature allows Word to be used as a general purpose XML editor, for documents which are susceptible to being edited as forms. I don't think that OOXML's system is that much different from using XForms, so the ODF people will be looking at this too. It is part of Microsoft's medium-term drive to make Word a viable interface for forms data, such as hospital records, as the front-end for various back-end products that exchange data with it. Data-as-documents.

The patent seems to be about out-of-line markup and indirect addressing of a string table. You read in a marked-up document and split it into 1) a list of content strings and 2) a map giving the codes for each string and information to allow addressing of the strings. You can then have multiple maps and various other practical advantages. (Actually, I find it hard to see why a hash table wouldn't offend this patent, but I suppose things don't work like that. And SGML's data attribute feature can be used for this kind of thing, can't it?)
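A rough sketch of that split, under my own assumptions: the tag regex, the `(offset, tag)` map layout and the function names are invented here for illustration and are not drawn from the patent's claims.

```python
import re

TAG = re.compile(r"<[^>]+>")  # crude tag matcher, good enough for a sketch

def split_markup(doc):
    """Separate a marked-up string into raw content and a map of (offset, tag)."""
    content = []
    tag_map = []
    offset = 0  # position in the raw content where the next tag applies
    pos = 0     # position in the original marked-up string
    for match in TAG.finditer(doc):
        text = doc[pos:match.start()]
        content.append(text)
        offset += len(text)
        tag_map.append((offset, match.group()))
        pos = match.end()
    content.append(doc[pos:])
    return "".join(content), tag_map

def merge_markup(content, tag_map):
    """Rebuild a marked-up string by re-inserting each tag at its offset."""
    out = []
    pos = 0
    for offset, tag in tag_map:
        out.append(content[pos:offset])
        out.append(tag)
        pos = offset
    out.append(content[pos:])
    return "".join(out)

doc = "<p>Hello <b>world</b>!</p>"
content, tag_map = split_markup(doc)
print(content)
print(tag_map)
print(merge_markup(content, tag_map) == doc)
```

Because the content string carries no markup, a second map with different codes (say, a different set of tags over the same offsets) can be merged with the same content, which is the "multiple maps" advantage mentioned above.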

[UPDATE: For a fuller description of the patented technology, see Part 2 on this blog.]

So it is unclear to me why it is OOXML's custom XML in particular that infringes this patent: it looks like an implementation technique. The world is so full of mystery. I'll have to troll around and try to find more details of the suit, I suppose. Reader hints welcome as always.

This will undoubtedly have some minor flow-on effect in IS 29500 (which may be just to remove the bits about customXML until the patent expires in 2015). (The mid-90s period in which the patent was granted of course represents the nadir of competency at the USPTO, with those of us in the rest of the world, particularly in standards bodies, scandalized at the kinds of things being allowed.)

But it looks like I will have to add ZDNET to the list of unreliable sources (just up from ComputerWorldNZ and Slashdot.) The US ComputerWorld has a more realistic piece on it: Injunction on Microsoft Word unlikely to halt sales.


12 Comments

Update: Front page on ZDNET as I write:

'Custom XML' the key to patent suit over MS Word

Rick,

Very nice analysis. My reading concurs largely with yours, and it rather begs the point - you have someone who was familiar with SGML and wrote a patent that effectively predated XML by a few years, though it's likely that there were SGML editors that existed at that time.

At the very least, this raises the question of whether there should in fact be a statute of limitations on patents - that once a technology is formally established, if you do not in fact claim infringement of a patent until several years after the patent is granted, that patent becomes null and void. This would both prevent the accumulation of defensive patents (I'm still unsure whether in fact i4i was the original patent-holder or a derivative holder) and would keep the current holder of a patent from jeopardizing an entire industry and holding it hostage for hundreds of millions or billions of dollars of damages.

This smacks of someone whose great-great grandfather had a deed on land that was passed down to his great grandson, essentially stating that his gr-father owned the land on which a mall had been developed. The grandson then goes to a judge with the claim that, by right, the mall and its revenues are his. It's possible that this person may be due minimal damages, but given that the grandfather, the original owner of the land patent in this case, failed to assert the rights of that patent, the value of the derivative rights should be minimized accordingly.

I commented on the first patent in my own blog. Overall, I would think that the provisions within the W3C (to which Microsoft is a signatory) would make it impossible to patent a specific XML format because it violates encumbrancy. However, I realize that this is not the currently accepted legal view. Perhaps after i4i, this will make Microsoft's lawyers really look hard at the advisability of not releasing 7,571,169 into the public domain.

I do not believe these two events were unrelated.

Kurt: The trouble is that the judgment in the recent case takes such a ludicrously maximal interpretation of every term that it becomes impossible to say what any patent means.

For example, the first patent talks about XML files, not XML-in-ZIP. But if you take an interpretation where you will accept any definition that happens to include the words "XML" and "file", then XML-in-ZIP would count too.

Software patents are bad enough, but patents interpreted by morons with no brakes are a disaster.

Kurt: The trouble is that the judgment in the recent case takes such a ludicrously maximal interpretation of every term that if the same approach were taken with other patents it would become impossible for any developer to say what any patent actually means.

For example, the first patent talks about XML files, not XML-in-ZIP. But if you take an interpretation where you will accept any definition that happens to include the words "XML" and "file", then XML-in-ZIP would count too.

Software patents are bad enough, but patents interpreted by morons and suckers are a disaster.

All of the mechanization is unpatentable as far as I can tell. What's wrong with the patent office? Techniques virtually identical to these (hashes, translate and embed separate format, use links) have been employed in computer science since the 70's. You say it's applied to a unique format? That would imply I could apply standard techniques to a new language (say spell check Urdu) and patent the approach.

Dave: Yes, whether the issue is how data is represented in an external format or how a program is organized internally, it seems pretty obvious.

Perhaps all of us who have been working in markup are so jaded and brain-damaged that we are unable to see the hidden novelty that stood out so much to the USPTO.

Rick: The US patent office is set up to judge physical inventions, not ideas, software, or business processes. So even though many patents appear to be issued in violation of the constitutional justification for patents (see http://lnxwalt.wordpress.com/2009/08/01/copyright-as-presently-defined-is-unconstitutional/ but remember that I'm not a lawyer), they issue them anyway. Software is a specialized field, and I would be very surprised if they had any internal experts that could consult with the examiners who are determining whether to issue the patent.

I think I am going to file a patent on operating an organization in such a way that it receives at least as much revenue as it has in expenses. That way, I can go after profit-making enterprises and force them to pay me license fees.

Walt has a nice line in his blog

If software is female, WordPerfect is the French maid, OpenOffice is MaryAnn (Gilligan’s Island) and MS Office is the Russian weightlifter. I know which one I’d avoid.

Kurt Cagle wrote: "someone who was familiar with SGML and wrote a patent that effectively predated XML by a few years, though it's likely that there were SGML editors that existed at that time."

I recall reading books about XML in 1998, so this can't "predate XML".

I prefer Nisus Writer, Xcode, QUED/M, TextEdit, vim, EdiF, Xedit... anything but Msword, but without more information this patent seems totally bogus, like IBM's patent on parsing words using whitespace as a separator.

Jgo: I think it was filed before XML was started.

It's just another example of 'Microsoft' bashing.
Those fools in Canada want to stifle innovation. XML, and all of its various uses, should have been interpreted as already being in the 'public domain' long before those idiots tried to patent its usage.

Analogy. I have a Ford tractor. Little 'piss-ant' company patents the usage of diesel fuel in a tractor.
Since Ford sells tractors that burn diesel fuel should they now have to pay a royalty to the 'piss-ants' for using diesel fuel in their tractor? Even worse, should I have to pay them a royalty for using diesel?
Ridiculous and so are these patents.
It should have been thrown out of court and the piss-ant company should have been made to pay ALL of the associated court and legal costs.

The i4i patent was filed in 94 and published in late 98 to work with SGML. XML came out in 98 but was based on SGML from the 80's.
I'm not sure if the patent should have been issued, but it was, like many others in the 90's. But why wait 10 years to go after XML unless they needed someone with deep pockets?
How many more patents are out there that could be used to make a lot of money with the right lawyers in the right court?
Software patents in general are a problem, since new file formats open the door for new patents using old ideas with only a small change.