Making HTML more like print

Things are improving, but so slowly

By Rick Jelliffe
March 18, 2010 | Comments: 14

I loved the design aspect of print: laying out pages, choosing, matching or adjusting fonts, making pre-processors to allow really smart typography choices, tweaking typography settings, re-writing paragraphs for nicer breaking, and so on. At the end of the day, you can have a really satisfying thing you can hold in your hand.

The move to HTML involved a lot of loss: it turned many of those skills upside down, dumbed down what is possible, and made it easier to produce bad results than good ones: dumb markup, dumb styles, dumb type, ugly pages. Looking at the web, you see individual sites that are OK, but there seems to be a rule that the more contributors a site has, the greater the chance that none of them was really involved in figuring out the site design. I am enjoying being a grumpy old man!

(<rant>The recent redesign of The Atlantic site, and the efforts of its bloggers like Andrew Sullivan and its users to get back to something reasonable, seem typical rather than exceptional to me. For other examples, you need look no further than the narrow-column design here at O'Reilly and how poorly it handles code fragments, let alone tables, headings or other things considered superfluous by the supposed designers. Is there some design school which teaches "whatever you do, don't do a comprehensive analysis or ask stakeholders"? The emphasis is not on technical communication.</rant>)

So I recently spent an afternoon looking at how far it is possible, in early 2010, to get in HTML the kind of quality typesetting you can readily get in print. Sorry I cannot show the mockups: the data belongs to the customer.

Hyphenation

The first thing I looked at was hyphenation. Hyphenation allows you to fit more information on a page, which means less scrolling, and it makes full justification aesthetically possible once you have more than 4 or 5 big words per line: this can give a nice sharp right-hand edge and a more pleasing design.

Good news here! Hyphenator is a great project at Google Code which adds hyphenation. You add a very simple piece of JavaScript, put a lang attribute on the body element to select the dictionary, and then mark the elements to hyphenate explicitly with a class. This is very tedious to author by hand (I didn't try whether the hyphenating style would cascade to child elements), but it is perfectly good enough for HTML generated by XSLT and the like. And it works great.
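To make the mechanism concrete: a script-based hyphenator works, roughly, by inserting soft hyphens (U+00AD) at permissible break points, so that the browser may break the line there and show a hyphen only when it does. The following is a minimal JavaScript sketch of that idea, not Hyphenator's actual code. The function name `insertSoftHyphens` and the hand-supplied break offsets are assumptions for illustration; a real library works out the break points from language-specific patterns.

```javascript
// Soft hyphen: invisible unless the browser breaks the line at it.
const SOFT_HYPHEN = "\u00AD";

// Insert soft hyphens into `word` at the given character offsets.
// `breakOffsets` is supplied by hand here (an assumption for this sketch);
// a real hyphenator would compute it from a per-language dictionary.
function insertSoftHyphens(word, breakOffsets) {
  // Keep only offsets strictly inside the word, in ascending order.
  const valid = breakOffsets
    .filter((i) => Number.isInteger(i) && i > 0 && i < word.length)
    .sort((a, b) => a - b);

  let result = "";
  let previous = 0;
  for (const offset of valid) {
    if (offset === previous) continue; // skip duplicate offsets
    result += word.slice(previous, offset) + SOFT_HYPHEN;
    previous = offset;
  }
  return result + word.slice(previous);
}

// "hyphenation" with break opportunities after "hy" and "hyphen".
const hyphenated = insertSoftHyphens("hyphenation", [2, 6]);
```

The visible text is unchanged until a line break falls on one of the inserted characters, which is why the approach fits generated HTML well.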


Fonts

Next I looked at better fonts, but with a less good result. I downloaded a particular set of quite lauded free fonts from a quite comprehensive site, and the result was pretty horrible, with ugly artifacts smudging the edges. Checking back on the nice pages that used this particular font, I saw they had quite a dark background, I suspect to hide these smudges. So it seems that many of the free fonts available on the web may be better used for headings at a large size than for body text.

If anyone can recommend a good free Garamond/Aldus humanist font which is very sharp at all increments between 8 and 14 points, I'd love any tips!

But then oops. The combination of the font and the hyphenator, which was fine on Firefox and Opera, didn't work on IE 7: the font was selected and the hyphenation broke the line, but the hyphen itself was not shown. I didn't have the patience to figure out what was going on.

Sidebar

Next, I wanted to make a sidebar to hold asides or comments on the main text flow, synchronized with it. I needed an extra container for these, but it worked out: I couldn't get baseline alignment between the main text and the container, but that is no biggie, and a little more work would probably sort it out.

Dynamic Layout

Finally, playing around with the design parameters, I tried a layout where there was one column if the window was small, and which flipped to a two-column design if the window grew large. The trick here is to make a div (or list) container which is set to 'inline' so that it can split. It worked OK, but I think I would need to put both columns inside yet another div to allow inner stretching margins and outer fixed margins.
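The decision the layout has to make can be written down in a few lines. This is a script-side sketch of the same one-or-two-column rule, not the CSS 'inline' container trick described above. The function name `columnsForWidth` and the default numbers (a 30em minimum column, at most two columns) are assumptions for illustration, not values from the mockup.

```javascript
// Decide how many columns fit, given the available width and the
// narrowest column we are willing to set body text in (both in ems).
// The defaults are illustrative assumptions, not measured values.
function columnsForWidth(availableEm, minColumnEm = 30, maxColumns = 2) {
  // Fall back to a single column for missing or nonsensical input.
  if (!(availableEm > 0) || !(minColumnEm > 0)) return 1;
  const fit = Math.floor(availableEm / minColumnEm);
  // Always at least one column, never more than the configured maximum.
  return Math.min(maxColumns, Math.max(1, fit));
}

const narrowWindow = columnsForWidth(40); // 1: only one 30em column fits
const wideWindow = columnsForWidth(65);   // 2: two 30em columns fit
```

Working in ems rather than pixels keeps the switch-over point tied to the reader's text size, which is the property worth preserving from print.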

Ultimately, I think anyone used to, say, the TeX system of boxes and penalties will find the CSS system of max, min and fixed sizes (em, px, in, cm, etc.) to be clunky. I just don't think the CSS model works very well without a lot of work. Too much work, even on browsers that comply with the standard and have good Acid test results.

However, CSS is at least great for supporting widths in percentages or ems. And if the Japanese Layout effort at W3C I18n manages to turn around our unwittingly racist (or just overworked) type-engine developers, and get grids better supported, CSS may become much better. But still I don't see it as anywhere near approaching the control that was typical for print.

Flash

To get print quality, it seems like Adobe's Text Layout Framework for Flash is so far ahead of HTML as to be in a different class. It is open source, but it is a foot in the door for Flash, which probably rules it out in most cases. The big drawback seems to be that it is a block-level API, so things like tables or layouts you have to roll yourself. You can have beautiful lines easily, but anything more complicated than that is way beyond what any SOHO user would want to take on.

Things are Improving, but so slowly

So the result: I don't see that current HTML+CSS gives anywhere near the same quality of typesetting that was easy a decade or two ago. However, things certainly are improving, and things like the Hyphenator appearing out of the blue improve the situation in a big lurch.

But it seems that the improvements have mainly been in the area of supporting magazine-style or advertisement-style layouts, while body text is pretty unsatisfactory: for body text, we seem little better off than in the days of the IBM Selectric typewriters 40 years ago, when people had three or four replaceable golf ball heads, and everyone's documents looked pretty much the same. The same thing was true in the early days of PostScript printers, with their few standard built-in fonts, but matters soon improved: that hasn't happened on the web yet, at least for body text.

I think, at the moment, that the best strategy for getting good layout would be to implement your own layout system on top of CSS: for example, as a jQuery plugin. The jQuery site already has many useful additions for improving tables and other layouts including multi-column layouts that provide raw material.


14 Comments

Hi Jeff

Remember that PrinceXML attempts something similar, using CSS and XHTML/XML to create PDFs from web pages. It has now arrived at version 7.0.

www.princexml.com

Hi Prince

This blog item is about getting nice dynamic screens with qualities like print. That you have to move to a different technology to look nice is rather the point.

The HTML in this entry looks broken… some link didn’t get closed properly or something?

Aristotle: Thanks for pointing that out: SNAFU. Some editor or process at O'Reilly inserted an advertising link (without my knowledge or consent, btw, though it is no biggie) saying that there is a lot of great material on HTML on the O'Reilly site, which is very true, but their markup was somehow wrong and rather undermined both the blog and the link. I will attempt to get it fixed.

Hi

Thanks for linking hyphenator.js
If you don't like to add class="hyphenate", there are many other ways to enable hyphenation. See http://code.google.com/p/hyphenator/wiki/en_PublicAPI#property_selectorfunction

BTW: it is cascading, so works well and hyphenates all elements.

Regards,
Mathias

Mathias: My pleasure. But thank you for doing the work and making it available.

re TLF: have a look at Lovely Reader

Lindsey: Lovely Reader seems to have a nice re-columning system that is the kind of thing I'd want. The examples I saw had ugly fonts (typewriter) with no full justification and hyphenation.

Having this or that is not difficult. Where is the HTML platform that makes it easy to have them all (e.g. hyphenation, varied and beautiful fonts at body text size, sidebars, em layout, and dynamic layout)?

hi Rick,

Thanks for trying out lovelyreader, a few answers to your comments.

1. Here are some screenshots of the reader: http://bit.ly/coYoAV. We are currently using system fonts, so on a Mac you will get Helvetica or similar typefaces.

2. We are working on a dynamic typeface solution which will provide a beautiful and ultra-compact font library for each book. This gives authors great control over how the content is displayed, and readers more control over how they want to read it.

3. Full justification, along with a bunch of other reading preference settings, is already implemented and will be in the next release.

4. Hyphenation is something we are working on. We have tried all the hyphenation libraries you mentioned, and will keep you posted on this.

Thanks.
vic
Co-founder, lovelyreader.com

Mathias: Oh, by the way, is there some chance of enhancing the API to allow overriding which character is used for hyphenation, and in which font/style?

That would allow some kind of workaround for the problem I had, where IE for some reason did not display the hyphen for the open source font.

I saw this news website that had an interesting yet simple approach: outsidemma.com

Basically they just put each news article in a box and float the boxes to the left. It's so simple yet effective, and depending on your screen size you'll get a different number of boxes fitting the width of the page. Unfortunately this approach is only used on their homepage. But I think it's a very simple and effective way for news websites to lay out their articles, as a compromise on column layouts.

Ramon: Thanks for the link: I found www.outsidemma.com worked better.

I have added Hyphenation and column movement to the www.schematron.com website, for people interested.

One downside of the Hyphenation library shows up when you copy and paste as plain text from a web page into a text editor: dashes are interspersed throughout the plain text content.

Example from www.schematron.com in Notepad:

A lan-guage for making as-ser-tions about the pres-ence or ab-sence of pat-terns in XML doc-u-ments. See this overview for more in-for-ma-tion.

If you paste in other editors accepting HTML (Word, this very HTML textarea where I'm actually typing), the dashes will not appear, unless you specify "copy as text without formatting" when available.

Olivier: That is an interesting thing.

There is an option on the hyphenation library that puts a little floating button top-right of the page to turn hyphenation on or off. I wonder if it would help on the cut behaviour.
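Another possible workaround, assuming the dashes are soft hyphens (U+00AD) that the plain-text editor displays as visible hyphens, would be to strip them from the selected text before it is copied. A minimal JavaScript sketch; the function name `stripSoftHyphens` is hypothetical, and hooking it up to the browser's copy event is left out:

```javascript
// Remove every soft hyphen (U+00AD) from a string, leaving ordinary
// hyphens and all other characters untouched.
function stripSoftHyphens(text) {
  return text.replace(/\u00AD/g, "");
}

// A fragment like the pasted example, with soft hyphens as the assumed
// cause of the visible dashes.
const copied = "A lan\u00ADguage for making as\u00ADser\u00ADtions";
const cleaned = stripSoftHyphens(copied);
```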
