What are Chinese Tables?

By Rick Jelliffe
October 27, 2008 | Comments: 3

Chinese characters may be more complicated than Latin letters, but they can express in two or four ems what might take two or four words in English: a little taller but much shorter.

So when faced with making tables where space is at a premium, Chinese (and Japanese and Korean) writers and layout artists naturally made use of these characteristics to allow denser tables. Chinese (and Japanese and Korean) text is itself traditionally written on a grid, which could be considered a kind of table.

There is almost no recognition of these unusual Chinese table forms. Over the last decade there has been, it is true, more support for some basic features (such as a simple diagonal header split) in some mainstream word processor products, but in general there is little awareness. (Indeed, application developers sometimes turn off Asian features in the Western versions of their software, even though the features may be useful, to avoid the perception of bloat. FrameMaker used to do this for ruby text, I recall.)

You can see a request for this kind of feature in OpenOffice at Diagonal Table Header Specification (ODF). But the kinds of features asked for there are just the thin end of the wedge, compared to the richness found in hand-laid-out Chinese tables.

One of my projects at Academia Sinica in Taiwan involved going through archives of Chinese printed books to find characteristic examples of Chinese tables. I used to have a website with dozens of scans of examples, but it has been taken down. I thought I would put some of the images that were saved in the Wayback Machine, by way of partial rescue.

(If this material looks familiar, I have previously put one of these diagrams in a blog item: Standardization as a collective failure of imagination)

Diagonal and Kite Headers

The first example is where the small size of Chinese characters makes it possible to collapse several rows of headers into a single upper left box.

This example is unusual because it actually uses English words, so we can get a really good idea of what the function of each subcell is. You can see that, graphically, it is not a good fit for English: it would presumably be worse with a language with even longer words, such as German!

You can see that the split cell contains titles for 1) the top header row, 2) the body of the table, 3) the subheaders in the second column, and 4) the headers in the first column. You can see also the kite shape of 2) which intrudes into 3). This is some graphical license, but the kite shape is very typical.

(If you want to see the whole graphic, right click and your browser probably has a "Show Image" item on the pop-up menu. Then use the back button to return to this page.)
t-chou1.PNG
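
For readers who think in data structures, here is a minimal Python sketch of what that split corner cell encodes. It is an illustration only, not anything taken from the scanned book: the four titles are the ones reported in the first comment below, the region descriptions paraphrase the list above, and the names CornerLabel, KITE_CORNER and region_of are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CornerLabel:
    title: str   # text printed in one sub-cell of the split corner
    region: str  # part of the table that the title names


# Illustrative encoding only: titles as reported in the first comment,
# region descriptions paraphrased from the article text.
KITE_CORNER = (
    CornerLabel("Qualities", "top header row"),
    CornerLabel("Results", "table body"),
    CornerLabel("Methods", "subheaders in the second column"),
    CornerLabel("Data", "headers in the first column"),
)


def region_of(title, corner=KITE_CORNER):
    """Return the table region that a corner title describes."""
    for label in corner:
        if label.title == title:
            return label.region
    raise KeyError(title)


# Print the corner as a small legend for the whole table.
for label in KITE_CORNER:
    print(f"{label.title:<10} -> {label.region}")
```

Seen this way, the corner cell is a small legend: every region of the table, including the body, gets a name.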

The following example is a much simpler kind of diagonal split-cell header. Its connection to the top row should be clear.

t-women2.png


Diagonal Data Cells

Not only headers may be split diagonally. Here are some data results. It seems that in this case the diagonal split is used to indicate that the data items in the cell are connected to each other. This is a semantic which seems missing from the vocabulary of Western tables. You can see it has a diagonally split header at the bottom left that matches the data items.

This kind of table has a graphical structure that we would perhaps more associate with a form rather than a table.

t-hk1.PNG

Finally, here is a very messy and repellent diagram. (If I remember, it was an old Taiwanese table assigning bopomofo letters to buses or trains for particular routes.) But rather than sneer, we should ask ourselves what graphical/writing problem this layout is solving.

In this case, we have a basic table with three header rows and three header columns. We have a split cell to give a label for the first two header rows and a label for the first two header columns. Then we have another split cell, giving labels for the third header row and the third header column. Try doing that without split cells!

t-b2.png

In the markup world, we have been living in and perpetrating a graphically impoverished set of technical capabilities. This started early, with the OASIS CALS Exchange Table Model, where vendors, in order to get better interoperability, agreed to remove from the data format features that were not common. The CALS model has in turn highly influenced HTML (and even ODF and OOXML): the exchange subset becoming in effect the last word in what tables are supposed to be.

For some more info on ways to view these, you might also check out my old pages at Academia Sinica: Chinese Tables and Split Cells. It is a shame that the more extreme examples of Chinese tables have gone AWOL: some are more like little diagrams than what we conventionally think of as tables. We Westerners are of course allowed to think that these things are mad, funny, quaint, interesting, horrifying, etc. (everyone is allowed to think that of a foreigner's culture), but I think we should be careful not to dismiss the usefulness of these idioms for CJK use, and even for our own use.

On the subject of internationalization, there are of course other sources for requirements floating about, particularly as China is regaining its voice: as well as impacting CSS, it will also impact OOXML and ODF (which themselves can act as rich sources of formatting ideas for CSS). One particular source of information on Chinese requirements, outside the W3C, is the scanty detail that can be gleaned in English about China's UOF format. There is a document UOF Translator Requirements on a SourceForge UOF-OOXML translator project, and a document Comparison Document for the ODF to UOF translator project.


Update: O'Reilly's formatting goblins don't like ASCII art formatting for reader comments, so I am putting in the table from Maria's comment below.



             |                | Variable Group A        |
_____________|________________|_________________________|___
             |                | subgroup a | subgroup b |
_____________|________________|____________|____________|___
row header 1 | row header 1.1 | data 1.1.a | data 1.1.b |
_____________| row header 1.2 | data 1.2.a | data 1.2.b |___



3 Comments

Hello,

I have read your recent posts about table layout in East Asian languages with great interest; thank you for presenting this to the public. There is certainly something special about East Asian table layouts: they are more complex, more sophisticated, more spontaneous and therefore, in some cases, inconsistent.

It has a lot to do with saving space and, as you already mentioned, with the compact layout of Chinese characters, which can be arranged more freely in several directions as compared to the alphabet. I will come back to this point at the end of my post.

As a scholar in the humanities working in multiple languages, German being my mother tongue, I am very concerned about how to code text and data reasonably and in a way that is beautiful and easy to understand. I am not happy with the HTML table model because it does not represent the data model beneath the data presented. (I am exclusively talking about tables presenting data, nothing graphical.)

The tables you present in this post are based on data as well, so it should be easy to analyse them. I am afraid this post will become a bit too long...

1. The "kite-design"

It seems to be a "wrong" implementation of diagonal splits: The diagonal partition shows 4 titles: "Qualities, Results, Methods, Data".

The "Results" partition seems unnecessary because it refers to the actual data cells, which need no naming.

3 Parts are left, the "Qualities" part, and only this part, classifying the column headers. Thus, the diagonal split for Qualities has to start in the left upper corner and end exactly in the right bottom corner in order to separate column headers from the row headers.

The row headers are hierarchically divided into two: The top part of the hierarchy showing "Data" (groups?), which are subdivided into groups according to 3 Methods. This hierarchy is not reflected in the "kite" design of the top left box. The kite design shows something like a flat structure, methods and data being on the same level (at the starting point of the kite). This may be true since the methods are 1 through 3 for every data group, still, the table has decided (and always has to decide) for a hierarchy.

It may be subject of discussion whether such "kite design" is a welcome aid to hint to a really flat hierarchy, where the constraints of table design have to show one hierarchy where there is none in reality. There is no common agreement about this (and I think no one even talked about this), and so I would not like to see this kind of design spreading. It would also disrupt the clear division between the column headers classification and the row headers classification.
In this actual example, a diagonal division for "Qualities" and simple naming of "Data" and "Methods" on top of the row headers without any linear division seems most appropriate to me.



2. I agree with your idea about the second example, but I do not see any reason to introduce a kind of diagonal split here. The data could be presented with a simple slash (like: 17% / 0%; this seems to be intended with the split) or divided into two rows for better comparison. The reason why I don't think the solution presented here is favorable is simple: there is no common agreement about presenting data like this, nobody knows exactly what is meant, and there are other solutions which can be interpreted with more reliability. Furthermore, there are more graphical tables in which splits like this indicate a rather fuzzy division between cells.

3. The third table is an interesting example, and despite some critique it shows very well what diagonal splits can offer:

In a table, variables are presented in columns, data sets/groups/units are presented in rows. Often, variables and units (or whatever one may call them) are grouped or subdivided, such that column headers and row headers may consist of several rows and columns representing unit-groups, units, subgroups or variable groups, variables and their subgroups. Each level in these hierarchies may be divided into more subhierarchies.

This table shows a 3-level hierarchy in variables, with one hierarchy subdivided: from the top of the table downwards it is
1 _order_ (of something -- I did not check the vocabulary, since it would take me days, it seems to be a phonetic explanation of the bopomofo)
2 _parts_ (of something). This is divided into a
2.1 _top_-something subgroup and a
2.2 _bottom_-subgroup.
3 simple names (I hope that is the correct translation)

This means, 4 rows for column headers.
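
The row count above can be checked with a small Python sketch. It assumes a deliberately simple model that is not part of the original table: each header level takes one row, and a level that is split into subgroups takes one more. The function name header_rows and the level labels are illustrative only.

```python
def header_rows(levels):
    """Count header rows: one per level, plus one for each level that has subgroups."""
    return sum(1 + (1 if subgroups else 0) for _name, subgroups in levels)


# The three levels listed above; only "parts" is subdivided.
LEVELS = [
    ("order", []),
    ("parts", ["top", "bottom"]),
    ("simple names", []),
]

rows = header_rows(LEVELS)
print(rows)  # 4
```

A deeper hierarchy (subgroups of subgroups) would need a recursive count instead; this flat version only covers the case described here.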

As for the units, they are divided into a supergroup "method" (written in the left split of the diagonal split top left), which consists of, as I interpret the table, 3 groups:

1 situation/appearance
2 (something)
3 air current or stream

The problem is the third method, since it is not mentioned among the exactly vertical columns below the diagonally split cell, but is added, being itself the left part of the diagonal split in the 3rd cell of the fourth row from top. It is thus separated from methods 1 and 2, but since it is partly beneath the "method" diagonal split and its content seems to be related to "method", I think it refers to methods, and thus the table design here is not the best possible. In addition, why is it put into the same cell as the variable name for "simple names"? The simple names may have to do with method (3) "air current"; if so, it is good to express this somehow in the table (there are solutions for this). The problem is that the diagonal split may or may not tell us something here, but we do not really know. It is even worse if there is no relation between air current and simple name.

A table like this without diagonal splits would create a lot of white space: boxes on the top left that cannot be filled with information about column header classes or row header classes, and rows or columns that cannot be filled because of row headers or column headers. It would look something like this:


             |                | Variable Group A        |
_____________|________________|_________________________|___
             |                | subgroup a | subgroup b |
_____________|________________|____________|____________|___
row header 1 | row header 1.1 | data 1.1.a | data 1.1.b |
_____________| row header 1.2 | data 1.2.a | data 1.2.b |___


Diagonal split of cells containing information about column and row header names saves this white space. I think it is not accidental that diagonal splits are particularly common in East Asian tables since, as you pointed out, the characters make it a lot easier to arrange the text in the cells than it would be with Western writing systems.
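
To put a number on that white space, here is a rough Python sketch that counts the blank cells in the top-left block of the ASCII example above. The grid encoding (an empty string for a cell with no text) and the names GRID, HEADER_ROWS, HEADER_COLS and blank_corner_cells are assumptions made for this illustration.

```python
# Illustrative grid for the ASCII sketch above; "" marks a cell with no text.
GRID = [
    ["", "", "Variable Group A", ""],  # last "" is covered by the column span
    ["", "", "subgroup a", "subgroup b"],
    ["row header 1", "row header 1.1", "data 1.1.a", "data 1.1.b"],
    ["", "row header 1.2", "data 1.2.a", "data 1.2.b"],  # first "" is covered by the row span
]
HEADER_ROWS = 2  # rows occupied by column headers
HEADER_COLS = 2  # columns occupied by row headers


def blank_corner_cells(grid, header_rows, header_cols):
    """Count the cells in the top-left block that carry no text."""
    return sum(
        1
        for row in grid[:header_rows]
        for cell in row[:header_cols]
        if not cell.strip()
    )


wasted = blank_corner_cells(GRID, HEADER_ROWS, HEADER_COLS)
print(f"{wasted} of {HEADER_ROWS * HEADER_COLS} corner cells are blank")
```

In this layout every corner cell is blank, and the block grows with the product of header rows and header columns. A diagonally split corner puts the header class names into that same area.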


These thoughts on table layout are closely related to the data model that tables are based on: HTML tables and their ancestor table models do not reflect the real structure. I hope very much that improvements in table design -- in whatever standard -- are accompanied by a "correct" implementation of the data structure. In other words, whatever the design of the table, the data should be encoded in the one and only correct way that their structure requires, and the design, whether with diagonal splits or whatever, is another matter.


In summary, the notion of diagonal cell splitting should be seen as an improvement: information can be presented in an extremely compact way. So I agree that diagonal splitting should be an option in modern text layout. Still, it should be thoughtfully applied in order to reasonably represent the data structure; the constraints due to technology may help to create even more reasonable tables.

Good design should reflect meaning; it should be easy to understand. Old design may sometimes be charming, exotic, mysterious, a pleasure to "decode", but my personal view of typography and design is that it should improve to become understandable at first sight. The "Chinese Tables" do not reach this aim, but they present ideas and tools that help improve table design as a whole.

Therefore, including this option into standards seems a good idea to me. I do not only think of ODF but also of CSS and XSL, and I do not only think of diagonal splits but also of table data encoding that reflects the data structure rather than a certain design.

Thanks for listening. It is a topic I have been contemplating for several months, so I was glad to express my ideas to someone who might be interested. I hope I did not repeat too much of what you already thought and expressed.

Maria

Maria: That is a very interesting and thought-provoking response.

I certainly agree that it is possible to use diagonal splits to make horrible tables! But that is also possible with spans, or any other features in tables :-) So it is no reason against them.

And I agree that the diagonal slash in the third data cell can be regarded as content, rather than division into two cells. However, it seems clear to me that in the mind of whoever made that table, they were dividing the cell into two.

And I think it is this fuzziness between what is diagrammatic and what is tabular that characterizes "Chinese tables". I had many more examples of quite bizarre tables.

The closest that I can think of in the West to this is the Periodic Table of Elements. It is a diagram presented as a table. In my old scans, I had several examples where tables consisted of a collection of subtables: more perhaps like what people might do in a spreadsheet with two different areas populated by data.

Finally, on your comment that "results" is unnecessary: one of my findings from looking at "Chinese tables" (they are CJK and there are marginalized versions of their features in the West too, of course) is that there is a tendency for the top-left header cell to be a kind of thumbnail guide to the whole table.

If we view the function of the top-left cell(s) as being thumbnail descriptions of the table regions that have been skewed so that their lines connect to borders better, the headers make more sense.

Of course, there is an aspect of idiom here: if you are used to tables having this kind of thing in the top-left, they will be easy to interpret. If you are not used to it, they will seem strange.

However, it is not the purpose of computer typesetting to dumb down our graphical idioms. The marketing line is that PCs are supposed to liberate us! I have mentioned before that the separation of data and rendering in diagrams in OOXML DrawingML (aka SmartArt) is a really great innovation in this regard.

Chinese do not, in my experience, see a great need to explain their cultural artifacts to foreigners; contrast this with Japanese, for example, where there is a kind of pride that outsiders may be interested in their culture, coupled with an analytical bent. This is why, I think, the Japanese are so much further ahead in explaining their requirements than the Chinese are. It is not a criticism of either culture (and it is a stereotype, if applied too strictly!)

However, in the case of these table forms, I think there has been a real lack of awareness of how important these table forms were, pre-1990, and how applicable and useful they still can be (not only to CJK, but also to the outside world.)

My first reaction to the "kite" table is that it's a creative reaction to dealing in two dimensions with something that needs to be in more than two dimensions. In my early days of editing, 30+ years ago, I frequently ran into such tables in technical documents and was known for spending hours with a knife and paste, reconstructing tables into other two-dimensional arrays that attempted to overcome some of their problems.

My later reaction would be that static media (e.g., the printed page) might not be the best for such data. Today I'd be inclined to turn such things into Topic Maps to enable rotating the data in live polyspace, as it were.

I agree that the CALS and HTML table models have greatly dumbed down our ability to present tables, even in two dimensions (and I was a member of the CALS committee). I have recently spent two months working with a major XML software vendor about the limitations in their rendering of HTML tables. They can't deal with our house style, something I helped to develop and then typeset quite adequately over a quarter of a century ago using troff and tbl.

The issue of whether to do real data representation or just presentational arrays (such as HTML) has been one of the enduring challenges in my decades of SGML and XML development. Every now and then, when I have a really significant data issue, I break down and do a custom application. What holds me back to the compromise of HTML tables for most things is a combination of issues: (1) Commercial vendors make it really hard to present custom data structures (given that they even have problems with HTML). (2) Even if I can create a stylesheet that properly renders my custom structure, I have to think about the issues of documentation, training, and leaving a legacy application for those who come after me.
