Key Fraunhofer study released on ODF and OOXML

By Rick Jelliffe
August 21, 2009

The Register has a copy of the Fraunhofer Institute's white paper Document Interoperability: Open Document Format and Office Open XML in PDF. It is likely to be a key document for evidence-based government procurement processes, and a good counter-weight to the vendors and the partisans.

This is a systematic stab at figuring out how much translatability there is between ODF and OOXML: for all the talk about ODF and OOXML over the years, there has been very little objective evidence. The best there has been is the various implementers' notes online, as well as discussion from various pundits or parties. The white paper deals with the current ISO versions of the standards, not implementations or non-ISO updates, which is a big caveat.

The conclusion is suitably vague:

It may be concluded that many of the functionalities, especially those found in simpler documents, can be translated between the standards, while the translation of other functionalities can prove complex or even impossible.

and

Thus, before choosing the format of the document and the tools, it is essential to be aware of the proper reasons why the exchange of documents is necessary and what requirements are needed for the translation.

But Herren Ziesing, Ishionwu and Dr. Eckert do rule out the extremes: on the one hand, that ODF does everything that OOXML does or that OOXML does everything that ODF does; on the other, that the two formats are so dissimilar that harmonization/convergence/accommodation/mapping is impossible even to pursue. And they avoid the issue of whether individual features are desirable or the best way to do whatever they do.

It is particularly striking how little would seem to be involved in getting the two spreadsheet systems to interoperate well (putting aside formula issues).

But the white paper does make a valuable point:

As the rules used for transformation are not standardized however, each application is allowed to use its own specific rules. Under certain circumstances specific rules can neglect certain properties and make specific assumptions which could enhance translatability.

The context of this is that Germany (and Fraunhofer) has been at the forefront of wanting a clear official mapping between ODF and OOXML. There is a working group at ISO/IEC JTC1 SC34 called WG5 whose job this is. But they have found it such a big task that, for fear of drowning in a sea of details, they are taking a bite at a smaller cherry first: scoping out the limits of interoperability.

I suppose this report might be good news for Typefi, which makes a tool to fix up document formatting to fit in with a corporate style, learned from a corpus. That would fix certain classes of round-tripping errors.

I was surprised to read an ODF developer recently say that round-tripping was not an important use-case: a document will tend to be written out in the same extension that it came in as. But formats now closely track the features implemented by different products, so rather than seeing the list only in terms of format mismatches, we also have to see it as a wish-list of features. That implementations will need to support their rivals' features in order to make the users happy has not changed. What has changed is that now some of the demand for new feature support is going towards Microsoft as well. ISO ODF is a lens to focus the feature lists of Office's rivals and to encourage MS to support it (the encouragement may come at the hands of government procurement offices, of course).

But I think there is an approach that would short-circuit many of these boring finickity interoperability shortcomings. That is for ODF and OOXML to retain as much markup from the imported document as possible.

Take themes for an example. I think I have pointed this out before, but the Fraunhofer report mentions that Office uses themes (parameterized styles, that can be swapped readily, such as for presentations) which ODF does not support.

So an OOXML document with themes may look exactly the same after translation to ODF, but on round-tripping back to OOXML the theme information would have been stripped out, and there would just be normal styles. That is a definite maintenance issue for some corporate users, but not remotely a Ma and Pa issue.

Rather than strip the themes out from the incoming OOXML, it would be better for an ODF implementation to retain the theme parts, and round-trip them even if it generates ODF, so that the OOXML system that imports the ODF can choose to use the original OOXML version. The user could be shown a dialog box: "Use the original styles and themes?"
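To make the idea concrete, here is a minimal sketch of that kind of pass-through, relying only on the fact that both formats are ZIP packages. The folder name `preserved-ooxml/` is my own invention (ODF defines no such location), and I am assuming the theme parts sit under `word/theme/`, as they conventionally do in a word-processing package; a real implementation would follow the package relationships and would also have to list the extra entries in the ODF manifest.

```python
import io
import zipfile

# Hypothetical folder inside the ODF package for preserved OOXML parts.
# ODF defines no such location; the name is an assumption for illustration.
PRESERVED_DIR = "preserved-ooxml/"

# Assumed location of theme parts in a word-processing OOXML package.
THEME_PREFIX = "word/theme/"


def carry_over_themes(docx_bytes: bytes, odt_bytes: bytes) -> bytes:
    """Return a copy of the ODF package with the OOXML theme parts added."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as docx, \
         zipfile.ZipFile(io.BytesIO(odt_bytes)) as odt, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as merged:
        # Copy every existing ODF entry unchanged, in its original order.
        for info in odt.infolist():
            merged.writestr(info, odt.read(info.filename))
        # Add the theme parts from the OOXML package, byte for byte.
        for name in docx.namelist():
            if name.startswith(THEME_PREFIX):
                merged.writestr(PRESERVED_DIR + name, docx.read(name))
    return out.getvalue()
```

An OOXML exporter that later finds entries under the preserved folder could then offer them back to the user instead of regenerating plain styles.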

OOXML has a great mechanism that it can use for round-tripping ODF better: MCE (Markup Compatibility and Extensibility). ODF would be better off adopting MCE, or an analogue: its current mechanism of arbitrary embedding is not really good enough.
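For readers who have not met MCE, the part that matters here is its AlternateContent element: a producer offers a Choice that requires some namespace, plus a Fallback for consumers that do not understand it. The sketch below shows roughly how a consumer picks a branch. The theme namespace `urn:example:themes` used in the usage note is made up, the function name is mine, and this is far from a full MCE processor: Ignorable, MustUnderstand and ProcessContent are all left out, and prefix bindings are treated as document-wide.

```python
import io
import xml.etree.ElementTree as ET

# The real MCE namespace; every other name in this sketch is illustrative.
MC = "http://schemas.openxmlformats.org/markup-compatibility/2006"
ALTERNATE = f"{{{MC}}}AlternateContent"
CHOICE = f"{{{MC}}}Choice"
FALLBACK = f"{{{MC}}}Fallback"


def resolve_alternates(xml_text, understood):
    """Replace each mc:AlternateContent with the content of one branch.

    `understood` is the set of namespace URIs this consumer supports.
    """
    # mc:Choice names its required namespaces by prefix, so collect the
    # prefix -> URI bindings while parsing (simplified: document-wide).
    prefixes = {}
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # the first element started is the root

    def usable(choice):
        required = choice.get("Requires", "").split()
        return all(prefixes.get(p) in understood for p in required)

    def expand(elem):
        # Return the list of elements that should stand in for `elem`.
        if elem.tag == ALTERNATE:
            branch = next((c for c in elem if c.tag == CHOICE and usable(c)), None)
            if branch is None:
                branch = elem.find(FALLBACK)
            kept = list(branch) if branch is not None else []
            return [e for child in kept for e in expand(child)]
        elem[:] = [e for child in list(elem) for e in expand(child)]
        return [elem]

    expand(root)  # assumes the root itself is not an AlternateContent
    return root
```

Given a Choice that requires a prefix bound to `urn:example:themes`, calling `resolve_alternates(text, {"urn:example:themes"})` keeps the themed markup, while calling it with an empty set keeps the Fallback. Both alternatives travel in the same file, which is exactly the property round-tripping needs.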

I think the smart money is that the standards will head towards this kind of multiple format document, simply because it is the least disruptive for application developers.

To a great extent, the so-called format wars were actually feature wars (see this), and as is so often the case, the decision about which features to support in a format is an artificial one that springs out of having to make a choice: the lack of support for plurality almost guarantees acrimony, and winners and losers. But I don't see that a file format needs to be burdened with features merely to have feature parity with a rival file format: it is much more straightforward merely to pass the rival format's features through undigested and plonk them in the output file with minimal change. This in particular applies to styles, numbering, schemas, metadata and themes, which have every chance of not being touched by casual edits as the document does the rounds. Content has a different set of trade-offs and considerations.

By way of comparison, I suppose an alternative would be for Word, on opening an ODF document, to look to see if the document has style names that match a theme, and then to prompt the user whether they want to restore the original themes. That has a certain attractiveness to it as well, but it doesn't solve the larger problem, which won't go away.

