I am very keen on modularity in standard technologies. Each standard is targeted and allows the selection of subsequent layers.
The basic reason is pragmatic: they work. Your implementation becomes a stack, which naturally leads to libraries and therefore to co-operation and re-use. The tops of the stacks can change as new ideas and technologies arise, without having to re-invent the wheel or lock their users in. When plurality is allowed, market/bazaar effects can operate, where suppliers and demanders of technology don't face technical barriers to their preferred technologies.
So I think modular/layered/stacked standards need to be right-sized.
In fact, I would go as far as proposing the following rule of thumb: no open standard should make a technology that would take an experienced and expert developer more than one month (full-time) to develop.
Anything that would take more than this should be sliced into layers (and namespaces) or even abandoned as not appropriate for an open technology. Think SVG Tiny rather than SVG. Think RELAX NG rather than XSD Structures. Think XML rather than SGML. Think Dublin Core rather than MARC.
Why do I think this? One reason is that we need more James Clark-friendly standards. I don't mean James only, but Open Source software projects have long been based on the work of motivated individuals who get an adequate basic implementation out in a reasonable time, then make it available for maintenance. I would go as far as claiming that large (i.e. in complexity and requirements, not thickness) standards are profoundly anti-open source and therefore anti-competitive.
The next reason is that smaller standards naturally lend themselves to distributed development as standards. In XML, it seems that it is only when there is a change of notation that this is feasible: in XSL we saw this with the three layers: XPath, XSLT and XSL-FO where the layering has allowed XPath and XSLT to have lives of their own independent of the other parts of XSL. In XSD, the part that people are most happy about, the datatypes, is exactly this kind of small standard. In ODF, the Open Formula WG has been able to proceed in parallel to the XSD WG. The CALS table model (in the form that spurred OASIS, the OASIS CALS Exchange Table Model) had a life and influence much larger than the military CALS program.
My third reason is that the larger a standard gets, the more difficult it is to maintain by its standards group and by its developers. We all have heard of spaghetti code, but a large standard may be sloshed in sauce bolognaise just as well. Small standards require and embody a separation of concerns.
And the final reason is cross-pollination. The more that a standard focuses on a single topic of a technology, the more suitable it will be for adoption by others. The more baggage a standard has (from the POV of potential adopters), the more likely it is that they will choose to re-invent the wheel rather than participate in that standard by adopting it.
Of course, not all paper-large standards are complex or poorly organized: sometimes there is an internal layering. But this kind of internal layering should be brought out into separate parts.
For example, for XML Schemas, I think there is a pressing need for an XSD-lite which would be, in effect, RELAX NG in drag, and suitable for databinding rather than validation. Only the built-in types, with no facets or explicit type restriction, union or list derivation. No complex type derivation. In fact, remove the whole type derivation superstructure (which is not to say that simple and complex types cannot be named and used; it is just the derivation that goes.)
Then make all the rest into separate layers with separate parts: datatype derivation, key/unique/assertion checking, PSVI, complex type derivation (including content model checking, UPA checking, name redefinition checking, etc), schema composition. XSD is certainly big enough for at least 5 layers. And there are the DSDL technologies that can be used in combination as well: NVDL and CDRL for example.
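To make the slicing concrete, here is a rough sketch in Python of a checker that reports which of these layers a given schema actually reaches into. The layer names come from the list above, but the mapping of XSD elements to layers, and the names LAYERS, layer_of and layers_used, are my own assumptions for illustration; no published XSD-lite profile exists to check against, and PSVI is left out because it is not visible in the schema document itself.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed mapping from XSD element names to the layers proposed in the text.
# This grouping is illustrative only; it is not taken from any specification.
LAYERS = {
    "datatype derivation": {"union", "list"},
    "complex type derivation": {"extension", "complexContent", "simpleContent"},
    "key/unique/assertion checking": {"key", "keyref", "unique", "assert"},
    "schema composition": {"include", "import", "redefine"},
}


def layer_of(local, parent_local):
    """Return the layer an XSD element belongs to, or None for the lite core."""
    if local == "restriction":
        # xs:restriction derives a simple type under xs:simpleType and a
        # complex type under xs:complexContent or xs:simpleContent.
        if parent_local == "simpleType":
            return "datatype derivation"
        return "complex type derivation"
    for layer, names in LAYERS.items():
        if local in names:
            return layer
    return None


def layers_used(xsd_text):
    """Map each layer beyond the lite core to the XSD elements that need it."""
    found = {}

    def walk(element, parent_local):
        # Elements outside the XSD namespace (e.g. inside xs:appinfo) are ignored.
        local = element.tag[len(XS):] if element.tag.startswith(XS) else None
        if local is not None:
            layer = layer_of(local, parent_local)
            if layer is not None:
                found.setdefault(layer, set()).add(local)
        for child in element:
            walk(child, local)

    walk(ET.fromstring(xsd_text), None)
    return found


LITE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="born" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

FULL = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Age">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="people">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="person" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="personKey">
      <xs:selector xpath="person"/>
      <xs:field xpath="."/>
    </xs:key>
  </xs:element>
</xs:schema>"""

print(layers_used(LITE))
print(layers_used(FULL))
```

The first schema uses only built-in types and plain content models, so it stays inside the lite core; the second pulls in the datatype derivation layer (through xs:restriction under xs:simpleType) and the key checking layer. A tool that only needs databinding could, on this view, stop at the core and refuse or ignore the rest.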
Now you might say, but this is madness speaking! There are so many technologies that are much bigger than the single developer: think of the size of Xerces or Eclipse for example. I would respond by saying that if you do look at the technologies that are thriving in the sense of having multiple contributors, it is largely because they allow the kind of meaty but limited contributions I am talking about: Emacs may be complex now, but making a new mode for a new notation is definitely this size.
In a sense, what I am talking about is RFC-sized technologies. The key Internet technologies are described in small standards from the IETF (Internet Engineering Task Force), which are often called RFCs (Requests for Comments). The IETF places a lot of emphasis on running code for its technology, and its technology traditionally has come out of the research community, where the teams are small. So the technologies that tend to come in to (and therefore out of) the RFC process are exactly of the kind of size I am suggesting.
Now I would not confuse what I am calling for with minimalism. Nor is it an attack on existing under-layered standards: you don't kick a man when he is down. Agile programming teaches us the value of refactoring and of having a development method and expectations that encourage refactoring. Standards committees perpetrating large under-layered standards should consider the practical advantages of layering.
Now I guess some people might think this is just Rick putting the boot into open standards championed by corporations and corporate-funded consortia that have enough resources they don't need to worry about implementability (!) But I am not really talking about limits to the total size of some final technology, rather about the limits of each slice of that technology. ODF and OOXML may be too big to expect a single developer to implement them even in many years, acting in isolation; however that is not the point. A large cake just needs more layers.
(In the case of OOXML and ODF, we can see that they already have many layers: the packaging, the different notations for formulas and so on, the different namespaces, the removal of compatibility features to different parts. But we need to look inside each namespace to see if it meets my suggested requirement. ODF and OOXML both have clouded the issue by making all parts optional in effect, which means that defining some interoperable core in some target class of application is first necessary before estimating how long it would take to implement.)
But it does go back to a point I have made several times on this blog over the last few years: the more that our laws require the use of open standards, the more that we will need to make sure that the kind of "openness" involved or created by those standards actually allows grass-roots market-enhancing (which may in some cases be a euphemism for 'disruptive') implementations. We would do better if our new standards were right-sized (and if existing ones were refactored to be right-sized.)
The Global War against Bunches of Dilettantes?
Regular readers will be aware of the comments I have pointed out by a representative of one of the big US monopolies, to the effect that openness just means that anyone who wanted to participate in a standard was not unduly blocked from participating, and that the primary stakeholders were the developers or vendors, who can be trusted to speak for their users since they have a commercial interest in making sure the standards meet their users' requirements. Nutty, or brazen, but true: it is a long way from the ISO view of standards as being an agreement by stakeholders.
I see that the same gentleman, indefatigable in his attempts to limit openness to US corporate big boys, recently even spoke of non-vendor participation at SC34 as a bunch of dilettantes. The basic principle is nothing but "we know what is good for you." (Is it any different from Sam Goldwyn's line "when I want your opinion I'll give it to you"? I am unconvinced by corporate paternalism.)
Actually, it is entirely expected: the gentleman is nothing if not consistent in following the Karl Rove playbook to pick the strength of the people you perceive as an attacker and try to make it a weakness: so I have commented that the ODF TC at OASIS is dominated by commercial corporate vendors, and the response is to say that SC34 has too many non-commercial corporate types. A better response would have been to call for broader participation at ODF TC (as I have done) and the same for SC34 (as I have done.) In fact, I am sure the gentleman involved will be delighted to learn that I am stepping up my participation in SC 34 WG4, as an invited expert, because our customers are requiring more OOXML work (i.e. the corporations, agencies and entities I work for, not related to MS!)
But the dilettantism that he is slurring is, to my mind, exactly the same kind of dilettantism you might accuse Richard Stallman of: that of the user insisting on having a measure of control over (or the ability to influence) the technologies they have to use. In the case of Stallman, this gets conducted in the arena of home-made software and open source; in the case of standard technologies, it has to be conducted in the arena of standards bodies and open standards.
And if dilettantism is really a concern, then the solution is more modularity, so that specialists in one area do not have to deliberate on issues outside their specialty. (Of course, I don't buy the whole 'dilettante' argument. Martin Bryan's phrase "standardization by corporation" applies here: we cannot expect large corporations to value industry-specific, specialist, boutique, niche, or RFC-sized standards and their processes, but we can demand our seat at the table. Standards need more participation. Standards work involves both domain knowledge and standards-process knowledge, and there is no necessity that one person can do it all or know everything.)
So I am favouring the term Open Technologies rather than Open Standards: meaning technologies and their enabling standards which don't exclude implementation for reasons of size and complexity, just as much as for reasons of openness or language or timezone or IP or corporate affiliation or technological tradition.