Selfishness and Rubies

Two steps forwards and one back? CSS selectors, WebKit, HTML5 and Ruby Annotations

By Rick Jelliffe
March 10, 2010

Today I have been preparing a course I'll be teaching that touches on jQuery and XPath, and I thought I'd make a little graphic showing the incrementally increasing power of CSS1, CSS2, CSS3, XPath 1, XPath 2, XSLT1, XQuery and XSLT2. On the way, while looking at proposals for CSS selectors, I found this fascinating comment from one of the developers of WebKit (the leading FOSS browser engine):

I just recently implemented more of the CSS3 selectors in WebKit, and it took a really large amount of work and multiple iterations to make the HTML5 spec viewable again after adding support for dynamic nth-child, last-child and sibling selectors. The spec literally became unviewable in WebKit for a while as I worked to optimize and optimize those selectors to try to get back the performance lost from adding support for dynamism.

In the end, I got back the performance, but at the cost of bloating memory.

If you think a parent selector can't render pages unusable in 2008, you are wrong. It can quite easily. The problem with this selector is that badly written rules could lock up a browser. An additional concern is that authors would use these rules just for alternative browsers (leaving IE to perform well and degrading Opera/Safari/Firefox performance).

I think you do not fully appreciate what it means to have to support selectors in a dynamic environment.

Badly written rules can lock up a browser? Err, doesn't he mean a badly written implementation? But the thing that grabbed me was that additional concern. If an author felt that sacrificing some browser performance was worth it to get the feature in their document, why on earth not?
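To make the "dynamism" in that comment a little more concrete, here is a toy model in Python. It is only a sketch and has nothing to do with WebKit's actual code: the function `matches_nth_child_odd` and the list of sibling names are made up for illustration. It shows that adding a single element can flip the match status of every existing sibling for a rule like `:nth-child(odd)`, which is the work an engine has to redo (or spend memory caching) when the document changes.

```python
def matches_nth_child_odd(siblings):
    """Return the siblings that a :nth-child(odd) rule would match (1-based positions)."""
    return [name for position, name in enumerate(siblings, start=1) if position % 2 == 1]


# Hypothetical list of sibling elements under one parent.
rows = ["a", "b", "c", "d", "e"]
before = set(matches_nth_child_odd(rows))

# A script inserts one new element at the front of the parent.
rows.insert(0, "new")
after = set(matches_nth_child_odd(rows))

# Existing elements whose match status flipped, although none of them was touched.
changed = sorted((before ^ after) - {"new"})
print(changed)  # ['a', 'b', 'c', 'd', 'e']
```

Whether that cost is the fault of the rule or of the implementation is, of course, exactly the question.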

Ruby Madness

In other WebKit news, I see that Ruby text is now being supported, at least for horizontal text. Excellent news for CJK users. What I had not realized is that we now seem to have three competing technologies for Ruby text, all from the W3C: the XHTML Ruby spec, CSS ruby, and the one in HTML5.

The primary difference is that the XHTML/CSS way keeps runs of text together. The Ruby spec has this example:

<ruby xml:lang="ja">
  <rbc>
    <rb>斎</rb>
    <rb>藤</rb>
    <rb>信</rb>
    <rb>男</rb>
  </rbc>
  <rtc class="reading">
    <rt>さい</rt>
    <rt>とう</rt>
    <rt>のぶ</rt>
    <rt>お</rt>
  </rtc> 
</ruby>

So there is a one-to-one correspondence between each base (rb) and each ruby text (rt). The advantage is that the base and the ruby text do not interrupt each other: in particular, if the markup is displayed on a browser or system that does not understand Ruby annotations (such as a typical text indexer), the characters in each word or name are in the correct positions. (In the case above, the reason for using this form is that the last kanji 男 has only a single reading character お while the other kanji require two characters each: if the simpler ruby form were used, you would get a span of 7 characters over 4, which would not be as clear and could, in a long span or at a line break, be a little confusing.)

HTML5 requires that this be done by interleaving each base character with its ruby text:

<ruby>
斎<rt>さい</rt>
藤<rt>とう</rt>
信<rt>のぶ</rt>
男<rt>お</rt>
</ruby>

Now this is clearly much simpler markup. But someone would have to be fairly demented to use it IMHO. At least until you are sure that Google and the other search engines know to treat this kind of rt element specially, the tradeoff that HTML5 forces on CJK users is that if they use ruby annotations, any system that does not understand them will see the base characters and their readings scrambled together, and the text won't get indexed properly.
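Here is a rough way to see the difference, using Python's standard `html.parser`. This is a sketch under an assumption: that a ruby-unaware indexer simply throws away the tags and keeps the character data in document order. The class `TextOnly` and the helper `visible_text` are my own names, not anything a real search engine exposes, and the first sample is the Ruby spec example with its whitespace condensed.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and ignores every tag, as a ruby-unaware indexer might."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Drop the whitespace that is only there to lay out the markup.
        self.chunks.append("".join(data.split()))


def visible_text(markup):
    """Return the text of the markup in document order, with all tags removed."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


# Grouped form: all bases first, then all readings.
XHTML_RUBY = """
<ruby xml:lang="ja">
  <rbc><rb>斎</rb><rb>藤</rb><rb>信</rb><rb>男</rb></rbc>
  <rtc class="reading"><rt>さい</rt><rt>とう</rt><rt>のぶ</rt><rt>お</rt></rtc>
</ruby>
"""

# Interleaved form: each base followed directly by its reading.
HTML5_RUBY = """
<ruby>
斎<rt>さい</rt>
藤<rt>とう</rt>
信<rt>のぶ</rt>
男<rt>お</rt>
</ruby>
"""

grouped = visible_text(XHTML_RUBY)
interleaved = visible_text(HTML5_RUBY)

print(grouped)      # 斎藤信男さいとうのぶお
print(interleaved)  # 斎さい藤とう信のぶ男お
print("斎藤信男" in grouped)      # True
print("斎藤信男" in interleaved)  # False
```

Under that assumption, a search for the name 斎藤信男 finds the grouped form but not the interleaved one.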

The HTML/CSS Ruby specifications gave us the promise of two steps forward: I think HTML5 has taken that promise but then taken one step backwards.

