Excellent result for @charset detection of CSS in WWW browsers

By Rick Jelliffe
September 15, 2008

I have long argued that every text format should carry a rigorous indication of the encoding used, especially formats used for mission-critical data. It is the counterpart to the rule "Never use the default write methods in the API": the only ways to avoid broken encodings are either to allow only a single encoding or to label encodings explicitly, point-to-point (and end-to-end).

In CSS, there are a variety of mechanisms for this. I think the use of an initial @charset is the best and most XML-like.
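To make the "initial @charset" idea concrete, here is a minimal sketch in Python of writing a stylesheet so that the label and the bytes always agree. The function name `serialize_css` and the sample rule are my own illustrations, not anything from the W3C tests, and the sketch assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1.

```python
def serialize_css(rules: str, encoding: str = "utf-8") -> bytes:
    """Return stylesheet bytes that begin with an @charset rule naming `encoding`.

    Hypothetical helper: assumes an ASCII-compatible encoding, so that the
    @charset rule itself is readable before the encoding is known.
    """
    # The @charset rule has to be the very first thing in the file.
    text = f'@charset "{encoding}";\n{rules}'
    # Encode explicitly with the same name we just wrote into the label,
    # instead of trusting whatever the platform default happens to be.
    return text.encode(encoding)


def write_css(path: str, rules: str, encoding: str = "utf-8") -> None:
    """Write the labeled stylesheet in binary mode so no default codec is applied."""
    with open(path, "wb") as handle:
        handle.write(serialize_css(rules, encoding))
```

The point of the design is that the encoding name is supplied once and used both for the label and for the actual encoding step, so the two cannot drift apart.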

The W3C Internationalization WG is a really great activity: super important, and perhaps a jewel in W3C's crown compared with other standards bodies. They have recently put out a set of tests for encoding detection in CSS implementations. Check the W3C site for the tests and the latest results. In the table below I have added results for the current version of Google Chrome (tested September 2008) and collapsed the individual sub-tests (some tests have several parts, so in the original some entries are "redder" than others). These things are all in flux, so the table will go out of date.

Apologies if the following table appears a long way down the page...

UA                            IE        Firefox   Opera     Safari    IE        Chrome
version                       7         3.0.1     9.51      3.1.2     8         Beta
OS                            XP        XP        XP        XP        XP        Vista Business
date                          20080808  20080808  20080808  20080808  20080808  20080916
1 HTTP                        no        yes       yes       yes       yes       yes
2 @charset                    yes       yes       yes       yes       yes       yes
3 link charset                no        yes       yes       yes       yes       no
4 inherited                   yes       yes       no        yes       yes       no
5 default to UTF8             yes       yes       no        yes       yes       yes
6 BOM                         yes       yes       yes       yes       yes       yes
7 BOM with @charset           yes       yes       yes       yes       yes       yes
8 Non-initial @charset        yes       yes       yes       no        yes       no
9 HTTP vs @charset            no        yes       yes       yes       yes       yes
10 @charset vs. link charset  yes       yes       yes       yes       yes       yes

Original table copyright © 8 Aug 2008 World Wide Web Consortium, (Massachusetts Institute of Technology, European Research Consortium for Informatics and Mathematics, Keio University). All Rights Reserved. http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231

So this looks pretty good for adopting the same policy for determining the encoding of CSS files as you use for XML: if there is a BOM, use that (typically your document is in UTF-16 of some kind); otherwise use explicit labeling with an initial @charset. That works with all the current generation of browsers tested, which is really great.
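The policy above (BOM first, then an initial @charset) can be sketched in a few lines of Python. This is only an illustration of the rule as I have stated it, not how any of the tested browsers implement detection; the name `sniff_css_encoding` is hypothetical, and the sketch deliberately ignores UTF-32 and any external labeling such as HTTP headers.

```python
import codecs
import re

# Byte order marks checked before anything else.
# Assumption: UTF-32 is out of scope for this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# An initial @charset rule: it must be the very first bytes of the file.
_INITIAL_CHARSET = re.compile(rb'^@charset "([^"]*)";')


def sniff_css_encoding(data: bytes):
    """Return the encoding named by a BOM or an initial @charset, else None."""
    # Step 1: a BOM, if present, decides the matter.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Step 2: otherwise rely on explicit in-file labeling.
    match = _INITIAL_CHARSET.match(data)
    if match:
        return match.group(1).decode("ascii", "replace").lower()
    # Step 3: unlabeled. The caller has to decide; this sketch refuses to guess.
    return None
```

Returning `None` for an unlabeled file, rather than silently assuming a default, is a choice made for this sketch: it lets a build step or checker reject stylesheets that carry no label at all.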

I think avoiding defaults (i.e., avoiding unlabeled files) is prudent. That is not primarily because Opera doesn't yet implement UTF-8 as the default, but simply because confusion about defaults has been the bane of interoperability. As a practical matter, explicit labeling within the object itself is much more robust than external labeling or defaults.

These results are much better than I expected. To me, the red marks (the "no" entries) tend to be for legacy things that I don't regard as best practice, necessary, or perhaps even desirable. Well done to all the developers there: this seems a really solid result for all the browsers!
