Word processors and typesetting systems get their printing characteristics from the typesetting algorithms they use. For example:
- Microsoft Word expands spaces between words in order to fully justify a line (i.e. make the start and the end of the line flush with the margins).
- WordPerfect can squeeze spaces to allow the same thing (at least, according to Microsoft)
- TeX both squeezes and expands spaces, to get an optimal result.
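To make the difference between these policies concrete, here is a toy sketch in Python. The function names, the arbitrary width units and the 50%/25% stretch and shrink limits are all assumptions made up for illustration; none of the products above is claimed to work this way internally.

```python
def gap_for_full_justification(word_widths, line_width):
    """Gap between words that makes the words exactly fill the line."""
    gaps = len(word_widths) - 1
    if gaps < 1:
        raise ValueError("need at least two words to justify a line")
    return (line_width - sum(word_widths)) / gaps


def gap_is_allowed(gap, natural_gap, policy, max_stretch=0.5, max_shrink=0.25):
    """Check a gap against a policy: 'expand', 'squeeze' or 'both'.

    max_stretch and max_shrink are invented limits, expressed as
    fractions of the natural gap; real products use their own rules.
    """
    if policy not in ("expand", "squeeze", "both"):
        raise ValueError(f"unknown policy: {policy!r}")
    can_squeeze = policy in ("squeeze", "both")
    can_expand = policy in ("expand", "both")
    low = natural_gap * (1 - max_shrink) if can_squeeze else natural_gap
    high = natural_gap * (1 + max_stretch) if can_expand else natural_gap
    return low <= gap <= high


words = [30, 40, 50]  # hypothetical word widths, arbitrary units
loose = gap_for_full_justification(words, line_width=130)  # gap of 5.0
tight = gap_for_full_justification(words, line_width=127)  # gap of 3.5
for policy in ("expand", "squeeze", "both"):
    print(policy, gap_is_allowed(loose, 4, policy), gap_is_allowed(tight, 4, policy))
```

With a natural gap of 4 units, the second line is one unit too long for its measure. The squeeze and combined policies can still set it, while the expand-only policy would have to move a word to the next line.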
And for deciding whether to hyphenate a long line:
- Word only looks at hyphenation on previous lines to decide whether to hyphenate
- TeX looks at the whole paragraph to see what the optimal hyphenation should be.
I gather that OpenOffice has the same default behaviour as Word: any readers with better knowledge please feel free to comment.
So we can see that Microsoft Word's algorithms will err on the side of adding spaces; there are other algorithms in Word, such as the paragraph and page breaking settings, that act similarly. We can say that Word's designers were very averse to crowded pages.
TeX's designer, Donald Knuth, on the other hand, was averse to non-optimality. This is why TeX—or typesetting engines forked from it—has been the typesetting system of choice for many years for the quality typesetting market.
So which is better? Crowded pages? Empty pages? Optimal pages? For drafts, empty pages. For professional documents intended for reading, optimal. For technical reference material, probably crowded. That is probably as far as I would have gone, until recently.
Let's say documents have 50 lines per page, and 25 of these are short lines, headings or blank lines. The different algorithms used by different products will result in an extra line being sent to the next page in about 10% of those documents, and this will result in an extra page being printed for about 2% of documents. Humans will clean up half of those. That leaves us with 1% of documents having an extra page printed using systems that err on the side of adding space. It does not need to be a scientific number.
But if we consider that there may be one hundred million word processing documents printed every day (anyone know the real number?), that would mean a million extra pages per day. It would be a fun college project to get a better estimate.
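For anyone who wants to plug in better numbers, the back-of-envelope sum can be written out as a few lines of Python. The function and parameter names are my own, the inputs are the guesses from the text (2% of documents gain a page, humans clean up half of them), and it assumes each affected document wastes exactly one page.

```python
def extra_pages_per_day(documents_per_day, extra_page_percent, cleanup_percent):
    """Extra pages printed per day, assuming one extra page per affected document.

    Percentages are whole numbers so the arithmetic stays in exact integers.
    """
    affected = documents_per_day * extra_page_percent * (100 - cleanup_percent)
    return affected // (100 * 100)


# Guesses from the text: 100 million documents, 2% gain a page, half get fixed.
print(extra_pages_per_day(100_000_000, 2, 50))  # → 1000000
```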
Now, paper is usually made from estate timber, so there probably is no SAVE THE TREES deforestation angle. But paper production takes energy, toxic bleaches are used, fuel is used to transport it, the carbon gets released if it is disposed of by burning, and more toner cartridges are used. A tiny effect for individuals, but a decent effect when aggregated.
Greener stylesheets and templates?
So perhaps it would be better for our typesetting software, including word processors, to default to tighter typesetting.
For example, governmental and corporate stylesheets and deployments may care to check Word options like:
- Do full justification like WordPerfect 6.x for Windows.
- Don't expand character spaces on the line ending Shift-Return.
- Don't add extra space for raised/lowered characters.
- Suppress "Space Before" after a hard page or column break
- Automatically hyphenate document
Actually, most of these options come from a page at IPBA, which I recommend, on setting up Word to get less horrible typesetting.
Many of the XSLT stylesheets and templates made for transforming XML into a word processor's or typesetter's native format pay a good deal of attention to making sure that each paragraph style has good properties for breaking, keep-with-nexts and keep-togethers. Autoformatting is often used for professional and bulk publishing, where unnecessary whitespace is a dollar expense or where there are fixed format restrictions.
All the different applications have different algorithms. As systems increasingly adopt standards, and as the pressure for standards to converge grows, decisions on which algorithm to use will increasingly need to be made. But at the moment none of the competing algorithms has a compelling technical superiority: indeed, it becomes a matter of taste, where some people prefer whitespace and others prefer concentrated text. You say tomato and I say tom-
An environmental argument that disfavours profligate page generation can provide a new angle for discussions in standards bodies about socially-required features and convergence targets.
In the case of OOXML (and I suspect this applies to ODF, if not in extent then certainly in kind), I think the standards processes (at SC34 WG4/WG5, liaising with OASIS?) should review IS 29500 and IS 26300 from this angle, make the best features available, and encourage the vendors to make the environmentally-friendly features the defaults.
In concrete terms, it comes down to issues like this. Widow and orphan control relates to paragraphs that have been broken so that only a single line appears at the top (widow) or bottom (orphan) of the page. The simplest way to implement widow and orphan control is to break the paragraph a line earlier than the detected widow or orphan. So we add some whitespace at the bottom of the earlier page and move a line to the next page.
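The simple approach can be sketched in Python as follows. This is only an illustration under loud assumptions: pages are measured in whole lines, each paragraph is just a line count, and the `paginate` function is hypothetical rather than any product's actual algorithm.

```python
def paginate(paragraph_lengths, lines_per_page):
    """Split paragraphs over pages with naive widow and orphan control.

    Returns a list of pages; each page is a list of
    (paragraph_index, lines_placed_on_this_page) tuples.
    """
    if lines_per_page < 3:
        raise ValueError("this sketch needs at least 3 lines per page")
    pages = [[]]
    room = lines_per_page
    for index, remaining in enumerate(paragraph_lengths):
        while remaining > 0:
            take = min(room, remaining)
            if 0 < take < remaining:       # the paragraph would be split here
                if remaining - take == 1:  # one line alone at the top: widow
                    take -= 1              # so break one line earlier
                if take == 1:              # one line alone at the bottom: orphan
                    take = 0               # so start it on the next page
            if take > 0:
                pages[-1].append((index, take))
                remaining -= take
                room -= take
            if remaining > 0:              # whatever is left opens a new page
                pages.append([])
                room = lines_per_page
    return pages


# 16 lines would fill two 8-line pages exactly, but the control adds a third.
for page in paginate([4, 5, 7], lines_per_page=8):
    print(page)
# → [(0, 4), (1, 3)]
# → [(1, 2), (2, 5)]
# → [(2, 2)]
```

Each of the first two pages ends one line short, and those two lines of added whitespace are what push the document onto a third page.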
But a much better system, both for optimal typesetting and according to this green angle, would be to bring the extra line forward onto the previous page/column, if the text frame or printing area on the previous page was large enough to allow marginally tighter line spacing, or if there was some discretionary area still white at the bottom of the column.
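The test for whether a page can absorb one more line might look something like the following Python sketch. It assumes every line on the page shares one line spacing (leading), that measurements are integers in a common unit such as tenths of a point, and that the stylesheet supplies a hypothetical `min_leading` it is willing to tolerate; the function names are mine.

```python
def can_pull_line_forward(frame_height, lines_on_page, min_leading):
    """True if one more line fits once the leading is tightened.

    All measurements are integers in the same unit (say, tenths of a point).
    min_leading is the tightest line spacing the stylesheet will tolerate.
    """
    return (lines_on_page + 1) * min_leading <= frame_height


def resolve_widow(frame_height, lines_on_page, min_leading):
    """Pick a strategy for a single widow line stranded on the next page."""
    if can_pull_line_forward(frame_height, lines_on_page, min_leading):
        return "pull forward"   # tighten the spacing; nothing spills over
    return "break earlier"      # fall back to moving a line to the next page


# A 600pt frame holding 50 lines at 12pt leading, in tenths of a point.
print(resolve_widow(6000, 50, min_leading=117))  # → pull forward
print(resolve_widow(6000, 50, min_leading=120))  # → break earlier
```

In the first case 51 lines at 11.7pt need 596.7pt, which fits in the 600pt frame; in the second the stylesheet allows no tightening, so the page falls back to the simple rule.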
WordPerfect implements a system where it will even adjust the margin sizes to allow a better fit. They were even granted a patent for it. In another sign of the toxic nature of software patents, this has probably prevented others from adopting it; and, since people tend to believe their own marketing, you can expect vendors who haven't historically provided this kind of feature to downplay its usefulness. They are making a virtue out of necessity, of course.
I am not suggesting that everyone should adopt the WordPerfect auto-fit system (if its patent has lapsed) or that a standard should require it. The point is that some vendors are used to saying that "we don't provide feature X therefore users don't require feature X therefore feature X is not useful therefore we should not support feature X in the future" (which in its extreme ODF-partisan form morphs to something like "therefore no software should support it since it will prevent interoperability").
But, just as internationalization and accessibility provide spotlights for objectively re-evaluating capabilities of technologies and the clauses in their standards, we would do well to also look at environmental audits of our standards for page-producing technologies.