I've been looking at the basic organization of the current generation of XML-in-ZIP formats. A couple of weeks ago I put up Page models and geometries of ODF, IDML and XFL and today I have been looking at Adobe's PDFXML (a.k.a. Mars), the ECMA/Microsoft XPS and the W3C/Canon's SVG Print.
I was originally planning to compare them because I thought they were in roughly the same space: people had told me that XPS was Microsoft's attempt to gain control in the PDF space, while Mars was Adobe's attempt either to ward off XPS or to have an iron in the fire in case they needed to have an XML binding ready (I suppose ultimately as part of the ISO PDF effort).
But when I looked at the structures in the ZIP files, and what they do, I did not see that they are addressing exactly the same problem. XPS seems to sit half-way between IDML and PDFXML.
PDFXML (which was code-named Mars) is page-oriented and presents the set of pages as SVG graphics. Each page has a further XML document holding information about it, and potentially a document containing annotations. At the root there is a file backbone.xml which looks like this (from the sample files):
<?xml version="1.0" encoding="UTF-8"?>
<!--The PDFXML format is preliminary and subject to change-->
<PDF PDFVersion="1.6" Version="0.9.0" PageLayout="OneColumn"
<LetterspaceFlags Value="0" type="integer"></LetterspaceFlags>
<Bookmarks src="/bookmarks.xml" />
<Metadata src="/META-INF/metadata.xml" />
<Page src="/page/0/info.xml" x1="0" y1="0" x2="612" y2="792" ID="2" />
<Page src="/page/1/info.xml" x1="0" y1="0" x2="612" y2="792" ID="0" />
<Page src="/page/2/info.xml" x1="0" y1="0" x2="612" y2="792" ID="1" />
<Role Name="Body Text 2" MapTo="P" />
<Role Name="InlineShape" MapTo="Figure" />
<Role Name="DropCap" MapTo="Figure"/>
<Role Name="Outline" MapTo="Span"/>
<Role Name="Subscript" MapTo="Span"/>
<Role Name="Superscript" MapTo="Span"/>
<Role Name="TOA" MapTo="TOC"/>
<Role Name="TOF" MapTo="TOC"/>
<Role Name="Strikeout" MapTo="Span"/>
<Role Name="Table Text" MapTo="P"/>
<Role Name="TextBox" MapTo="Div"/>
<Role Name="Heading 1" MapTo="P"/>
<Role Name="Heading 2" MapTo="P"/>
<Role Name="Normal" MapTo="P"/>
<Role Name="Endnote" MapTo="Note"/>
<Role Name="Footnote" MapTo="Note"/>
<Role Name="Underline" MapTo="Span"/>
<Role Name="TOFI" MapTo="TOCI"/>
<Role Name="Frame" MapTo="Div"/>
<Role Name="Shape" MapTo="Figure"/>
<Role Name="TOAI" MapTo="TOCI"/>
<Role Name="Normal (Web)" MapTo="P"/>
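To make the shape of that backbone concrete, here is a minimal Python sketch that reads a cut-down copy of it and pulls out the page list and the role map. The cut-down document, its closing tags, the absence of a namespace and the function name `read_backbone` are all assumptions for illustration; with no schema published, treating `x1`/`y1`/`x2`/`y2` as corners of the page box is a guess from the sample values.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for backbone.xml. The closing tags and the lack of a
# namespace are assumptions made so that the snippet parses on its own.
BACKBONE = """<PDF PDFVersion="1.6" Version="0.9.0" PageLayout="OneColumn">
  <Bookmarks src="/bookmarks.xml" />
  <Page src="/page/0/info.xml" x1="0" y1="0" x2="612" y2="792" ID="2" />
  <Page src="/page/1/info.xml" x1="0" y1="0" x2="612" y2="792" ID="0" />
  <Page src="/page/2/info.xml" x1="0" y1="0" x2="612" y2="792" ID="1" />
  <Role Name="Heading 1" MapTo="P" />
  <Role Name="Footnote" MapTo="Note" />
</PDF>"""


def read_backbone(xml_text):
    """Return (pages, roles) from a backbone-like document."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter("Page"):
        # Assumption: x1/y1/x2/y2 are opposite corners of the page box.
        width = float(page.get("x2")) - float(page.get("x1"))
        height = float(page.get("y2")) - float(page.get("y1"))
        pages.append({"src": page.get("src"), "width": width, "height": height})
    # Each Role maps a producer-specific style name to a generic structure type.
    roles = {role.get("Name"): role.get("MapTo") for role in root.iter("Role")}
    return pages, roles


pages, roles = read_backbone(BACKBONE)
for page in pages:
    print(page["src"], page["width"], page["height"])
print(roles)
```

The role map is the interesting part: it is a flat lookup from style names to a small vocabulary of structure types, which is all a consumer gets to work with.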
When Adobe say Mars is preliminary, they are not kidding. They don't even have a schema or basic documentation available at their website (unless it comes bundled in some obscure package.) Adobe is very much the driver, but they seem to have thrown away the car keys: there is no sign of recent activity. Adobe has been very private in its enthusiasm for SVG since it acquired Flash: hot potatoes spring to mind.
This lack of information doesn't mean we don't know what PDFXML is: we can let the name set our expectations. Just as PDF is an output-oriented format with limited capabilities for re-purposing and limited capabilities for alteration and structural representation, so is PDFXML.
XPS
XPS is also page-oriented: it goes as far as calling them fixed pages, which should set our expectations. It uses the Open Packaging Conventions (OPC) and Markup Compatibility and Extensibility (MCE) from the OOXML standard. It doesn't use SVG but its own elements for canvas, path and transforms, putting it into the SVG/PDF space. It pays a lot of attention to colors and character spacing, which means it is targeted at supporting industrial-quality output: you would expect that this is something that PDFXML would eventually get if it were to be completed.
But three variations from PDFXML caught my eye:
- The first is that XPS has hooks for a print ticket system. So it is aimed at being able to deliver print jobs to printing houses. This certainly puts it in the PDF arena. It doesn't define a ticket language however: is that something that is supposed to be a B2B issue? Or just a matter for a different standard and a different day?
- The second is that XPS defines a basic generalized display-oriented language to represent document structures. We are used to the idea of a system retaining the outlines for a TOC, but XPS also retains information about each block of text: it retains information on paragraph boundaries, lists and tables. It is external markup that points to blocks on the pages.
PDFXML has a clunky-looking system where an SVG page image can have a struct.xml which provides a mapping list that attaches role labels (generic identifiers?) to elements in the SVG, using a kind of tumbler system 1/8/9/4. It is not clear to me why this could not have been done inside the SVG file and extracted as needed: that it uses fixed indexes is a real "read-only"-ism.
- And the third is that these structural units are organized as distinct stories. Like IDML, a page may have multiple stories, and a story can have multiple fragments on a page (e.g. several columns) and go between multiple pages. This is an interesting thing to provide, because it suggests that XPS is much more suitable for
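The "read-only"-ism of the PDFXML tumbler addresses in the second point can be shown with a small Python sketch. It assumes each number in an address like 1/8/9/4 is a 1-based child position counted from the root; that interpretation, the function name `resolve_tumbler` and the toy SVG are all assumptions, since the real struct.xml semantics are undocumented.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Toy page, not taken from any Mars sample file.
SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <g>
    <text>Heading</text>
    <text>Body</text>
  </g>
</svg>"""


def resolve_tumbler(root, tumbler):
    """Follow a path of child positions (assumed 1-based) down from root."""
    node = root
    for step in tumbler.split("/"):
        index = int(step) - 1
        children = list(node)
        if not 0 <= index < len(children):
            raise IndexError(f"step {step} has no matching child")
        node = children[index]
    return node


root = ET.fromstring(SVG)
print(resolve_tumbler(root, "1/2").text)  # Body

# Insert one element ahead of the target: the same address now silently
# points at a different element.
root[0].insert(0, ET.Element(f"{{{SVG_NS}}}rect"))
print(resolve_tumbler(root, "1/2").text)  # Heading
```

Nothing fails when the page is edited; the label simply lands on the wrong element, which is why positional addressing only suits content that never changes.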
Here is the top-level structure (from the ECMA standard)
And here is what a story fragment looks like:
<StoryFragments xmlns="http://schemas.openxps.org/oxps/v1.0/documentstructure">
  <StoryFragment StoryName="Story1" FragmentName="Fr1" FragmentType="Content">
    <StoryBreak />
    <ParagraphStructure>
      <NamedElement NameReference="Block1" />
      <NamedElement NameReference="Block2" />
    </ParagraphStructure>
    <StoryBreak />
  </StoryFragment>
  <StoryFragment StoryName="Story1" FragmentName="Fr2" FragmentType="Content">
    <StoryBreak />
    <FigureStructure>
      <NamedElement NameReference="Block8" />
    </FigureStructure>
    <ParagraphStructure>
      <NamedElement NameReference="Block9" />
    </ParagraphStructure>
    <TableStructure>
      <TableRowGroupStructure>
        <TableRowStructure>
          <TableCellStructure>
            <ParagraphStructure>
              <NamedElement NameReference="Block10" />
              <NamedElement NameReference="Block11" />
            </ParagraphStructure>
          </TableCellStructure>
          <TableCellStructure>
            <ParagraphStructure>
              <NamedElement NameReference="Block12" />
            </ParagraphStructure>
          </TableCellStructure>
        </TableRowStructure>
      </TableRowGroupStructure>
    </TableStructure>
    <StoryBreak />
  </StoryFragment>
</StoryFragments>
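As a rough illustration of how an application might use this markup, the following Python sketch stitches fragments back into stories by collecting the named blocks under each StoryName. The function name `blocks_by_story` and the shortened sample are hypothetical, and it assumes the caller supplies the story-fragment documents in reading order.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DS_NS = "http://schemas.openxps.org/oxps/v1.0/documentstructure"

# Shortened version of the story-fragment example, for illustration only.
SAMPLE = """<StoryFragments xmlns="http://schemas.openxps.org/oxps/v1.0/documentstructure">
  <StoryFragment StoryName="Story1" FragmentName="Fr1" FragmentType="Content">
    <ParagraphStructure>
      <NamedElement NameReference="Block1" />
      <NamedElement NameReference="Block2" />
    </ParagraphStructure>
  </StoryFragment>
  <StoryFragment StoryName="Story1" FragmentName="Fr2" FragmentType="Content">
    <FigureStructure>
      <NamedElement NameReference="Block8" />
    </FigureStructure>
  </StoryFragment>
</StoryFragments>"""


def blocks_by_story(documents):
    """Map each story name to its block references, in the order given."""
    stories = defaultdict(list)
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for fragment in root.iter(f"{{{DS_NS}}}StoryFragment"):
            # iter() walks in document order, so nested table cells and
            # paragraphs are flattened into one reading sequence.
            refs = [
                named.get("NameReference")
                for named in fragment.iter(f"{{{DS_NS}}}NamedElement")
            ]
            stories[fragment.get("StoryName")].extend(refs)
    return dict(stories)


print(blocks_by_story([SAMPLE]))
```

The result is only a list of pointers to blocks on the pages: the structure names which marks belong together, but the text itself still lives in the page markup.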
So neither XPS nor PDFXML has enough information by itself to reconstruct an original XML structure, including attributes and so on. But it looks like the strategies for editing are that with PDFXML you get labels that might be helpful to guide the generation of HTML from the file, while with XPS you just get to label a simpler set of block structures (still helpful for smarter cut and paste), and a better set of hooks for building publishing systems on top of XPS. I don't imagine either is sufficient for quality reconstruction of XML: equations, ruby, footnotes, references, and so on.
I expect they are primarily there for outlines and accessibility. (Accessibility means providing enough extra information as a standard part of the file to allow some alternative renderings and navigation, particularly to prevent people with certain kinds of atypical abilities from unnecessarily being penalized: to a certain extent, it is technologies that create or remove disabilities, not the condition itself. These pages at O'Reilly have a screen reader service, in case you didn't notice it, by the way. I have asked to be able to select voice characters based on markup, but it is no go yet: I think it would help with block quotes especially.)
So we can remove PDFXML from the picture, as not being in a serious state. In one corner we have ISO/Adobe PDF and Adobe IDML, in the other corner we have Ecma/Microsoft XPS. And in a third corner we have W3C SVG Print 1.2, which seems (Aug 2009) stuck in draft at W3C: Part 2: Language and Part 1: Primer.
SVG Print
SVG Print is page-oriented too, but with a master page system. It seems particularly associated with Canon (actually, the development seems to come out of Sydney, so if anyone there wants to contact me, I'd love to catch up on the status and talk through some possibilities!)
<svg xmlns="http://www.w3.org/2000/svg">
  <!-- Default Background Master Page definitions go after the svg element and before the pageSet -->
  <defs>
    <!-- definitions here are available on each page -->
  </defs>
  <!-- graphics here are visible on each page -->
  <pageSet>
    <masterPage rendering-order="over">
      <!-- graphics here are visible in the foreground on each page -->
      <text x="20" y="20">This text is on the top of every page.</text>
    </masterPage>
    <page id="circle_page">
      <defs>
        <!-- These definitions are local to this page. -->
      </defs>
      <!-- graphics for page 1 go here -->
      <circle cx="100" cy="100" r="20" fill="blue"/>
    </page>
    <page orientation="90" id="rectangle_page">
      <!-- graphics for page 2 go here -->
      <rect x="100" y="100" width="20" height="20" fill="red"/>
    </page>
    <page noPrint="true">
      <!-- This page is not printed -->
    </page>
  </pageSet>
</svg>
There is a graphic in the Primer about the relationship with XSL-FO that is handy.
SVG print also allows job tickets and is neutral with regard to format.
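To show how little a consumer needs in order to walk the page model in the SVG Print sample above, here is a small Python sketch that lists the pages a printer would output. The function name `printable_pages` is hypothetical, the document is a trimmed copy of the sample, and treating a missing orientation attribute as 0 is an assumption rather than something checked against the draft.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Trimmed copy of the pageSet sample.
DOC = """<svg xmlns="http://www.w3.org/2000/svg">
  <pageSet>
    <masterPage rendering-order="over">
      <text x="20" y="20">This text is on the top of every page.</text>
    </masterPage>
    <page id="circle_page">
      <circle cx="100" cy="100" r="20" fill="blue"/>
    </page>
    <page orientation="90" id="rectangle_page">
      <rect x="100" y="100" width="20" height="20" fill="red"/>
    </page>
    <page noPrint="true"/>
  </pageSet>
</svg>"""


def printable_pages(svg_text):
    """Return (id, orientation) for every page not marked noPrint."""
    root = ET.fromstring(svg_text)
    pages = root.findall(f"{{{SVG_NS}}}pageSet/{{{SVG_NS}}}page")
    return [
        # Assumption: a page without an orientation attribute is unrotated.
        (page.get("id"), int(page.get("orientation", "0")))
        for page in pages
        if page.get("noPrint") != "true"
    ]


print(printable_pages(DOC))
```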
Where do they fit?
There should be no surprise that we have three standards in the pipeline for a similar thing. They each come from the dominant ecosystems in the publishing world: Adobe, Microsoft and Open Source (i.e. everyone else.)
When talking about standards, people often blather that it would be much simpler if everyone adopted only one standard. However, interconvertibility between standards or exchangeability between applications of the same class is not the only efficiency factor or source of network effects operating. As well as horizontal exchange there is also vertical exchange: if we all adopted PDFXML with exchange/print ticket capabilities that would be good for PDF ecosystem developers, but it would not make life easier for SVG ecosystem developers.
It is probably better for SVG's survival for it to add page-oriented capabilities, so that it is not excluded from areas that could inject vitality into it. (I don't know whether page features would do this: certainly mobile phone features have though, with SVG Tiny.)
Plurality, again
I suspect that the situation with SVG Print, PDF and XPS is the same as with OOXML and ODF: the route to convergence may not happen at the level of markup harmonization at all, but instead through support for plurality, by allowing a file that is SVG and XPS and PDF at the same time.
In this view of things, we should be looking at how to organize the ZIP packaging with conventions to allow maximal sharing of media and metadata, selection of alternatives by an application, non-interference between alternate versions, and knowing how an application acts when an alternative is stale.
For example, what concrete declaration conventions do we need in order to have the JAR-style META-INF folder in an OPC document? I have great hopes and no expectations that the ODF TC, which is slated sometime to look at ODF's packaging conventions, will take some initiative in this regard: it would be great if they and ECMA could liaise enough to get ODF's requirements put into OPC and MCE (IS 29500 OOXML Part 2 and Part 3), and for ODF to then adopt OPC and MCE.
(I don't necessarily mean for ODF to adopt indirect addressing through OPC relationships, or necessarily all of MCE, nor is there any reason why OPC and MCE are so fixed in stone that they couldn't be augmented or profiled to suit ODF.)
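The packaging requirements listed above (shared media, selectable alternatives, non-interference, and detecting a stale alternative) can be made a little more tangible with a Python sketch. Everything in it is hypothetical: the META-INF/alternatives.json manifest, the folder layout, the notion of one "primary" rendition and the function names come from no existing standard. It only shows one way staleness could be detected, by recording a hash of the content each alternative was generated from.

```python
import hashlib
import io
import json
import zipfile

# Hypothetical part names, not taken from OPC, ODF or any other convention.
MANIFEST = "META-INF/alternatives.json"
PRIMARY = "primary/content.xml"


def digest(data):
    return hashlib.sha256(data).hexdigest()


def build_package(primary, alternatives, shared):
    """Write one ZIP with a primary rendition, alternates and shared media."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(PRIMARY, primary)
        for name, data in shared.items():
            zf.writestr(f"shared/{name}", data)  # media used by every rendition
        entries = {}
        for fmt, data in alternatives.items():
            path = f"alt/{fmt}/content"  # one folder per format, no overlap
            zf.writestr(path, data)
            # Record which version of the primary this alternate came from.
            entries[fmt] = {"path": path, "derived_from": digest(primary)}
        zf.writestr(MANIFEST, json.dumps(entries))
    return buf.getvalue()


def replace_primary(package, new_primary):
    """Simulate an editor that updates the primary but not the alternates."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                same = info.filename != PRIMARY
                dst.writestr(info.filename, src.read(info) if same else new_primary)
    return out.getvalue()


def stale_alternatives(package):
    """List the alternates whose recorded source hash no longer matches."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        current = digest(zf.read(PRIMARY))
        entries = json.loads(zf.read(MANIFEST))
    return sorted(f for f, e in entries.items() if e["derived_from"] != current)


pkg = build_package(b"<doc>v1</doc>", {"svg": b"<svg/>", "xps": b"<FixedPage/>"},
                    {"logo.png": b"fake image bytes"})
print(stale_alternatives(pkg))
print(stale_alternatives(replace_primary(pkg, b"<doc>v2</doc>")))
```

A real convention would also have to say what an application does on finding a stale alternate (regenerate, ignore or warn), which is exactly the kind of behaviour the packaging standards would need to pin down.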
I think we have to be much smarter in how we think about standards. It is easy to think about them in terms of agreements or libraries or mandates: all planned and directed activities. But I don't think that will work in this kind of case, where there are multiple, rival technical ecosystems. They don't want to agree. They have fans who don't want to look at alternatives. And they all are probably open enough to squeeze under the rosy gate of legitimacy from a public policy view. So to get convergence we need some other strategy.
I suggest that the smart thing to do is to set up a pluralistic infrastructure that reduces the barriers to incremental convergence. For example, take metadata. Adobe has its XMP, which uses RDF and Dublin Core: IDML and PDFXML support it. ODF is adopting an RDF approach (they have a wider meaning for metadata, if I understand it correctly, to mean all sorts of extensions) that also includes Dublin Core. SVG obviously could be augmented with RDF, being a W3C family technology. And OOXML supports Dublin Core. I don't see what metadata XPS supports, but it does require that applications allow XMP metadata in JPEGs (APP1).
With a compatible ZIP infrastructure, there would be very little in the way of then adopting a common (or common-enough) metadata approach. No-one would lose control, because they were all doing the same kind of thing anyway. But it wouldn't require a branding divorce, or large re-programming, or disruption of feature sets, or change in direction for anyone. At some point, if things panned out well, it would just become technically easier to do the same as everyone else than to be different.
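The "common-enough" point can be illustrated with a Python sketch that pulls Dublin Core fields out of two differently wrapped metadata documents using the same few lines. The two samples are hand-written approximations of an XMP packet and an OPC core-properties part, not files produced by any application, and the function name `dublin_core` is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Approximation of an XMP packet: Dublin Core inside RDF.
XMP_STYLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <rdf:Description rdf:about="">
      <dc:title>
        <rdf:Alt><rdf:li xml:lang="x-default">Report</rdf:li></rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

# Approximation of an OPC core-properties part: Dublin Core used directly.
OPC_STYLE = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Report</dc:title>
</cp:coreProperties>"""


def dublin_core(xml_text):
    """Collect Dublin Core elements wherever they sit in the tree."""
    found = {}
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith(f"{{{DC_NS}}}"):
            name = element.tag.split("}", 1)[1]
            # Join nested text (e.g. rdf:li) and normalize whitespace.
            found[name] = " ".join("".join(element.itertext()).split())
    return found


print(dublin_core(XMP_STYLE))
print(dublin_core(OPC_STYLE))
```

Because both wrappers already carry elements from the same Dublin Core namespace, the consumer does not need to care which ecosystem wrote the file; only the location of the metadata part inside the ZIP differs.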
I've looked at packaging previously in Packaging formats of famous application/*+zip. Without some kind of push for an enabling standard in this area, individual standards efforts have no choice but to fragment things further.