Super-styling: Are our current page-breaking hints too low-level for acceptable interoperability?

By Rick Jelliffe
December 1, 2008 | Comments: 2

In the past, we could neatly categorize publishing systems into perhaps four big classes:

  • quality or batch typesetting (book typesetting) such as 3B2, JADE or TeX,
  • word processing (WP) such as WordPerfect or AbiWord,
  • desktop publishing (DTP) such as PageMaker,
  • report generators such as Crystal Reports.

Recently many products have swapped over into some new arrangement that I haven't bothered to categorize: for example, Adobe InDesign handles the design-intensive pages that we associate with DTP but also allows batch runs, so it can handle many report-generator and typesetting tasks too. The old categories are not as compelling to the market any more, and so the products are getting more diffuse.

One of the big drivers is of course the advent of XML. It profoundly disentangles the input from the middleware, in the same kind of way that PDF disconnects the output from the middleware. Similarly, the adoption of XML-in-ZIP formats by WP applications makes them more congenial to uses beyond word processing.

I point out these different classes because there is a great difference in the capabilities that each of these kinds of systems support. The one I am particularly interested in is page-breaking behaviour: or, at least, the settings or hints that some software uses to know when a page is full and a new page break needs to be done.

Break early, break often

In the past, this was easy. DTP applications had no idea of pages: if the block was full, the text ran off the screen, although within a page there was great freedom for object placement. WP systems (including the early batch ones, such as troff) had quite simplistic mechanisms. Typical was a needs system (in troff .ne and in Scribe @needs), which specified that if there was less than a certain amount of space left on the page, a page break should be generated.
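To make the needs idea concrete, here is a minimal sketch in Python. It is not how troff or Scribe implement it; the function name, the (name, height, need) tuple layout and the units are all assumptions made for illustration. Each block says how much vertical space must remain before it is placed, and the paginator starts a new page when that much is not available.

```python
from typing import List, Tuple

# A block is (name, height, need). `need` is the minimum space that must
# remain on the page before the block is placed (hypothetical units).
Block = Tuple[str, float, float]


def paginate_with_needs(blocks: List[Block], page_height: float) -> List[List[str]]:
    """Place blocks in order, breaking the page when a block's need is not met."""
    pages: List[List[str]] = [[]]
    used = 0.0
    for name, height, need in blocks:
        remaining = page_height - used
        required = max(height, need)
        # Only break if the current page already holds something; an
        # oversized block on an empty page is placed anyway.
        if pages[-1] and remaining < required:
            pages.append([])
            used = 0.0
        pages[-1].append(name)
        used += height
    return pages


blocks_plain = [("para1", 40, 0), ("heading", 5, 0), ("para2", 30, 0)]
blocks_needs = [("para1", 40, 0), ("heading", 5, 15), ("para2", 30, 0)]

stranded = paginate_with_needs(blocks_plain, page_height=50)
carried = paginate_with_needs(blocks_needs, page_height=50)
```

Without a need on the heading it is stranded at the foot of the first page; with a need of 15 units it is carried over to sit with the paragraph that follows it. Note how blunt the tool is: the need is a fixed amount of space, not a statement about what the heading belongs to.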

The higher quality typesetting systems have much more sophisticated methods available. These include both programmatic methods (most high quality typesetting systems are actually languages, so that complex requirements can be coded) and some slightly higher-level declarative methods. The latter are often expressed as properties of text blocks or even (in structural tools) as properties of containers of text blocks (think of HTML's div element).

These properties include various mechanisms for binding one paragraph to another: for example, a keep-with-next property on a heading, so that if the paragraph following the heading is pushed onto the next page, the heading goes with it. These interact with various other mechanisms: some systems have keep-together properties on text blocks, so that the block is not broken across pages; some systems have widow and orphan controls, which prevent a paragraph breaking so as to leave a single line at the top or bottom of a page. TeX even allows an interaction between hyphenation and page-breaking, so that a page does not end with a hyphen.
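As a rough illustration of widow and orphan control, the sketch below decides how many lines of a paragraph may stay on the current page. The function name and defaults are assumptions; the terminology follows the CSS convention, where orphans is the minimum number of lines left at the bottom of a page and widows the minimum number carried to the top of the next.

```python
def lines_to_keep_on_page(total_lines: int, lines_available: int,
                          orphans: int = 2, widows: int = 2) -> int:
    """Return how many lines of the paragraph to set on the current page.

    0 means the whole paragraph moves to the next page.
    """
    if total_lines <= lines_available:
        return total_lines  # the paragraph fits, no break needed
    # Take what fits, but hold back enough lines to satisfy the widow minimum.
    first_part = min(lines_available, total_lines - widows)
    if first_part < orphans:
        return 0  # too few lines would be left behind, so move everything
    return first_part


fits = lines_to_keep_on_page(total_lines=4, lines_available=6)
split = lines_to_keep_on_page(total_lines=5, lines_available=4)
moved = lines_to_keep_on_page(total_lines=5, lines_available=1)
```

In the second call only three of the four available lines are used, so that two lines rather than one start the next page. That deliberately wasted line is the kind of space cost these controls trade for appearance.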

Word processing applications typically had very limited capabilities for these properties. Their user model was someone typing a document and doing all the work by hand, while the complex typesetting systems had a batch model: if you are producing a 1000 page book every day, you don't want to have to go through each page individually, you want the software to make a good stab at it. Some typesetting applications even logged warning messages to inform the typesetter about page breaks that were considered bad practice. Microsoft Word is typical of older WP applications in that it provides only the most basic facilities here, certainly too basic to be very credible for quality publishing where space is a consideration.

Systems which use needs or keep-with-next tend to be profligate with space: the more that the content varies in unpredictable ways, the more chance that the pagination will go haywire. Think about the following edge case: a book where the last word of a chapter does not fit onto a left-hand page, resulting in an almost empty pair of pages.

Industrial typesetting

But there are other use cases where these page breaking issues become pressing. For example, there has long been a requirement in many military publishing standards for technical or operational manuals that all the material belonging to a warning section should be kept on the same page: consider the sentence "Snip the red wire<page break> only after snipping the blue wire. Failure to observe this sequence will result in detonation." Sometimes there is also a requirement that a warning should be kept on the same page as its reference.

Furthermore, when dealing with the area of multi-column printing of topic material of large size, such as telephone directories, databooks and catalogs, the issue of optimal packing of pages can be crucial: no page should be wasted. In the case of yellow page directories, for example, typesetting can involve moving display sections backwards and forwards a few pages to get the optimal fit, even if it means the material is actually adjacent to and prior to the heading it belongs under. (I know one system that does a four page look-ahead in order to get optimal layout.) This is "floats" gone mad, very far from conventional word processing capabilities.

There are in fact multiple different mechanisms for specifying page-breaking behaviour, and even within each mechanism different systems provide different levels of control: a system may support orphan control but no widow control, for example. This issue crops up elsewhere: when Tim Bray said we don't need two different ways to say bold, I think he was talking about the fact that different fonts implement boldness differently, and that it was better to hide this than expose it. Consider a demi-bold font: should it be treated as a completely different font (this is how OOXML would represent it: it copes with bold or non-bold but not intermediate stages) or handled with an attribute? XSL-FO has a font-weight number, while ODF punts and provides both: the font-adornment attribute for locating a particular font, and "bold", "normal" or a reasonable selection of the numbers from XSL-FO.

One of the challenges for ODF/OOXML convergence (let alone OOXML/CSS convergence, UOF/ODF convergence, etc) is that figuring out the best strategy to cope with these things is not simple. On the one hand, merely kitchen-sinking (adding everything) adds a big burden to developers; on the other hand, keeping things simple would dumb-down the format and cause more repagination disruption than necessary. And a middle way of supporting some convenient intermediate set may not avoid the problems of the extremes.

Super-styling

Recently I have been considering whether our ideas of page-break styling, derived as they are from the mechanisms of quite ancient WP systems, could be replaced (augmented? superseded?) by some higher-level styling concepts that would allow greater mechanism-independence for typeset output. This has been prompted by several jobs, both over the years and recently.

To give the basic idea, I derive three properties which can be applied to any element:

  • Announcement which is the need for some element to be graphically separate from its surroundings. This has two flavours: starting and ending.
  • Cohesiveness is the need for an element to be on the same page, column or line. I'd distinguish three flavours of this, initial cohesiveness (first page, column, line, etc), final cohesiveness, and general cohesiveness.
  • Relevance is the need for an element to be exactly in sequence. I'd distinguish two flavours, forward and backward, which allow floating scoped to progressively more siblings and ancestors.

The idea would be that there is a fairly concrete correspondence between the settings of these properties and the various keep-with-next, widow/orphan, soft/hard page-break and similar mechanisms of the various typesetting systems. So a document that was specified using these super-styles would be retargetable between typesetting systems that were very different, without raising the bar for vendors too much in terms of their mechanisms.

For example, a chapter element would have very high starting announcement (to force a page break to the recto page) and very high final cohesiveness (to help ensure that the last line is not a widow). A section would have medium starting announcement (no forced page break unless it fell fairly low on the page) but high initial cohesiveness, so that the heading is not orphaned at the bottom of a page but carried over with its text. A military warning would have low announcements but high cohesions and high relevances. A floating diagram could have low relevances, high cohesions (for the caption) and medium announcements.
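One way to picture the three properties and the four examples above is as a small data model, plus a function that lowers a super-style into the concrete hints a particular system understands. Everything in this sketch is an assumption for illustration: the five-step Level scale, the field names, the defaults, the thresholds and the names of the output hints are not taken from any existing format or product.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict


class Level(IntEnum):
    """Hypothetical five-step scale for every super-style property."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


@dataclass(frozen=True)
class SuperStyle:
    starting_announcement: Level = Level.LOW
    ending_announcement: Level = Level.LOW
    initial_cohesiveness: Level = Level.LOW
    final_cohesiveness: Level = Level.LOW
    general_cohesiveness: Level = Level.LOW
    # Assumed default: ordinary elements must stay in sequence.
    forward_relevance: Level = Level.HIGH
    backward_relevance: Level = Level.HIGH


EXAMPLES: Dict[str, SuperStyle] = {
    # Very high starting announcement, very high end (final) cohesiveness.
    "chapter": SuperStyle(starting_announcement=Level.VERY_HIGH,
                          final_cohesiveness=Level.VERY_HIGH),
    # Medium starting announcement, high initial cohesiveness.
    "section": SuperStyle(starting_announcement=Level.MEDIUM,
                          initial_cohesiveness=Level.HIGH),
    # Low announcements, high cohesions, high relevances.
    "warning": SuperStyle(initial_cohesiveness=Level.HIGH,
                          final_cohesiveness=Level.HIGH,
                          general_cohesiveness=Level.HIGH),
    # Low relevances, high cohesions, medium announcements.
    "figure": SuperStyle(starting_announcement=Level.MEDIUM,
                         ending_announcement=Level.MEDIUM,
                         initial_cohesiveness=Level.HIGH,
                         final_cohesiveness=Level.HIGH,
                         general_cohesiveness=Level.HIGH,
                         forward_relevance=Level.LOW,
                         backward_relevance=Level.LOW),
}


def lower(style: SuperStyle) -> Dict[str, bool]:
    """Translate a super-style into low-level hints (thresholds are guesses)."""
    return {
        "page_break_before": style.starting_announcement >= Level.VERY_HIGH,
        "keep_with_next": style.initial_cohesiveness >= Level.HIGH,
        "keep_together": style.general_cohesiveness >= Level.HIGH,
        "widow_control": style.final_cohesiveness >= Level.HIGH,
        "may_float": (style.forward_relevance <= Level.LOW
                      and style.backward_relevance <= Level.LOW),
    }


chapter_hints = lower(EXAMPLES["chapter"])
figure_hints = lower(EXAMPLES["figure"])
```

A system with only keep-with-next would read just that one key and ignore the rest, while a richer system could use the underlying levels directly instead of the thresholded booleans.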

Run-in headings are headings which are part of the same text block as their following paragraph. They are a little tedious because so many WP systems do not support the idea of consecutive elements being in the same block, so you often have to transform the data so that the heading element is shifted into the subsequent paragraph. In terms of super-styling, we can say a run-in heading is one with medium starting announcement but no or negative ending announcement.
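The data transformation for run-in headings might look something like the following sketch, which uses Python's standard xml.etree.ElementTree module. The element names head and p, and the function name, are hypothetical; a real document type would use its own vocabulary.

```python
import xml.etree.ElementTree as ET


def make_run_in(parent: ET.Element, heading_tag: str = "head",
                para_tag: str = "p") -> None:
    """Move each heading into the paragraph that immediately follows it."""
    children = list(parent)  # snapshot, since the parent is modified below
    for heading, following in zip(children, children[1:]):
        if heading.tag == heading_tag and following.tag == para_tag:
            parent.remove(heading)
            # The paragraph's leading text now comes after the heading.
            heading.tail = following.text
            following.text = None
            following.insert(0, heading)


section = ET.fromstring("<sec><head>Caution.</head><p>Wear gloves.</p></sec>")
make_run_in(section)
result = ET.tostring(section, encoding="unicode")
```

After the call the heading is an inline child at the start of the paragraph, which is the shape most WP formats need before they can set it as a run-in.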

What are the advantages of this kind of super-styling? The simplest is the mechanism independence. And, in theory, the properties could be reverse engineered from existing styles and format properties. It could reduce the need for every system to do everything, by increasing the transposition of information from one mechanism to another, to some extent. 

But it gives two other good advantages. The first is that it provides an abstract style that can be applied even to non-graphical rendering. For example, consecutive elements with high announcement that are fairly high up in a document tree might be split up into separate HTML pages (or separate tabs in a user agent) automatically. Elements that have low relevance could similarly be factored into other pages or divs in an HTML conversion. An element with a high announcement could be preceded by a suitably dramatic pause or ping in a speech synthesizer.

The second is that it gives a mechanism by which bad page breaks can be identified, even when the document is being set by systems with only primitive mechanisms: for example, to track down whether a big chunk of whitespace on a page is acceptable or not. This is something that is perhaps more interesting to developers of typesetting systems for lengthy tomes of automatically typeset technical material than to SOHO users.
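A checker along these lines could be quite small. In the sketch below a page's trailing whitespace counts as acceptable only when the element that opens the next page has a high enough starting announcement to justify it. The PlacedPage record, the 0 to 4 announcement scale and both thresholds are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlacedPage:
    blank_fraction: float            # share of the page left empty at the bottom
    next_starting_announcement: int  # 0-4, for the element opening the next page


def suspicious_breaks(pages: List[PlacedPage],
                      blank_threshold: float = 0.25,
                      justified_at: int = 4) -> List[int]:
    """Return indexes of pages whose trailing whitespace nothing justifies."""
    return [
        index
        for index, page in enumerate(pages)
        if page.blank_fraction > blank_threshold
        and page.next_starting_announcement < justified_at
    ]


layout = [
    PlacedPage(blank_fraction=0.60, next_starting_announcement=4),  # chapter follows
    PlacedPage(blank_fraction=0.50, next_starting_announcement=1),  # plain paragraph follows
    PlacedPage(blank_fraction=0.10, next_starting_announcement=0),  # nearly full page
]
flagged = suspicious_breaks(layout)
```

Only the second page is flagged: the first is mostly empty too, but a new chapter explains the gap. The point is that the judgement comes from the super-style, so it works the same way whichever engine actually produced the pages.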


2 Comments

Hi, Rick,

I think your brainstorm is worthy of study, but in terms of harmonizing ODF and OOXML, it conceivably could collide with very brittle code in the older implementing apps. (Yeah, I know that implementations are supposed to follow standards rather than vice versa, but reality trespassed on that concept in the cases of ODF and OOXML.)

I don't know offhand of a corresponding problem with the OpenOffice.org code base, but Microsoft Word has had persistent and serious issues with page breaks and sub-documents since version 1.0. See e.g., the Y2k report of my study of Word bug reports relating to foot notes and end notes.

Not reported there, but several of the bug reports linked also mentioned identical misbehavior for other types of Word sub-documents.

To be clear, I have not updated that research since 2000, but one can reasonably suspect from the state of the app then that page breaks and subdocuments are probably still a problem.

Unfortunately, OOWriter and Word handle page breaks quite differently. It's not an easy issue for an ancient app still running 16-bit code in its page layout engine that wants to keep all content in objects.

E.g., consider a page that begins with a paragraph style from which a footnote is linked. Below the paragraph, a table begins but won't fit on the page. In both the portion of the table before the page break and in the continuation portion, there are other footnotes linked.

The page needs to break the first paragraph object, the table object, and the first footnote object linked from the table, and still allow for the footer object and perhaps a separate page number, and that's all before calculating the page break for the next page.

Now throw in a page break bug at the architectural level that's survived a dozen point-zero releases. Your task is to mess with that code to implement new page break algorithms.

The picture I get in my mind is something akin to a near-infinite chain of mirrors passing a reflection down the chain, where an adjustment of the angle of one mirror trashes the focal point in all others but the mirror you are trying to change the focal point in.

Perhaps folks might be tempted to change employers rather than touch that code?

Nb., to be clear, I am not suggesting that Word is the only word processor with persistent issues. E.g., my favorite word processor is WordPerfect and there are chunks of that app too that Corel engineers didn't want to touch even when Corel was a much larger company and WordPerfect had a much larger market share, despite severe usability bugs such as the QuickWords feature's insistence on a single file-locked data store.

To whomever designs the page style: Quit reducing the contrast by making the text gray. LEAVE IT BLACK!
