CSDL: Conceptual Schema Definition Language

The nail in the coffin for W3C XML Schemas?

By Rick Jelliffe
February 1, 2010 | Comments: 2

CSDL (Conceptual Schema Definition Language) is the schema language made available under the OSP (Open Specification Promise) as part of the ODP (Open Data Protocol). ODP is a technology that originated at Microsoft and is spreading from there.

I suspect it is the nail in the coffin for W3C XSD: small is beautiful! CSDL models the Entity Data Model (an entity-relationship (ER) model). I think CSDL is to RDBMS data what RELAX NG is to publishing documents and reports.

Classifying schema languages

When we look at schema languages, there are three basic issues (even though usually schema languages don't present themselves as functions, but as equations which some data will satisfy):


  • What is the input data structure?

  • What kind of grammar is the schema language?

  • What is the output data structure? (Is the schema language defined as a recognizer or transducer?)

For example, in RELAX NG the input is an attribute-value tree (XML), the grammar is regular (and ambiguous), and the output is a boolean (valid or invalid). Of course, an application is free to produce richer information as the output of validation, but that is not in the standard.

Schematron's default, in contrast, is that the input is an XPath Data Model (a directed coloured rooted graph, a path-rich XML), the grammar is context-free (because of XPath variables and current()), and the output is an XML attribute-value tree (i.e., SVRL).

XSD's input data structure is an attribute-value tree (XML), the grammar is regular (and non-ambiguous: the famous UPA for example), and the output is a non-XML tree (PSVI).

CSDL's input data structure is a multi-rooted, directed, possibly cyclic graph (in particular, the Entity Data Model: entities and associations); the grammar is regular, and the output is not defined. However, it seems that it can be used for serialization, for validation, and for data import back into an RDBMS.
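The four classifications above can be put side by side in a small Python sketch. The class name `SchemaLanguageProfile` and its `kind` property are hypothetical names invented for illustration; the field values are simply the descriptions given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaLanguageProfile:
    """Answers to the three questions for one schema language (illustrative only)."""
    name: str
    input_structure: str
    grammar: str
    output: str

    @property
    def kind(self) -> str:
        # A boolean output means a recognizer; any richer output means a transducer.
        if self.output == "boolean":
            return "recognizer"
        if self.output == "not defined":
            return "unspecified"
        return "transducer"


PROFILES = [
    SchemaLanguageProfile("RELAX NG", "attribute-value tree (XML)",
                          "regular, ambiguous", "boolean"),
    SchemaLanguageProfile("Schematron", "XPath Data Model",
                          "context-free", "XML attribute-value tree (SVRL)"),
    SchemaLanguageProfile("XSD", "attribute-value tree (XML)",
                          "regular, non-ambiguous", "non-XML tree (PSVI)"),
    SchemaLanguageProfile("CSDL", "multi-rooted, directed, possibly cyclic graph",
                          "regular", "not defined"),
]

KIND_BY_NAME = {profile.name: profile.kind for profile in PROFILES}
```

Looking up `KIND_BY_NAME["RELAX NG"]` gives `"recognizer"`, while XSD and Schematron come out as transducers and CSDL stays unspecified, which matches the prose.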

How does CSDL manage to have a graph as its input, if it is XML? By the simple trick of defining a simple wrapper element: Entity Data Model for Data Services Packaging Format (EDMX). All your entities can go below that.
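A minimal Python sketch of the wrapper trick follows. The element and attribute names (`Wrapper`, `Entity`, `id`, `refs`) are made up for illustration and are not the real EDMX vocabulary; the point is only that a flat list of entities under one wrapper element, linked by keys, can carry a multi-rooted and even cyclic graph inside an XML tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document: every entity is a direct child of the wrapper,
# and links are expressed as key references rather than by nesting.
DOC = """
<Wrapper>
  <Entity id="a" refs="b"/>
  <Entity id="b" refs="c"/>
  <Entity id="c" refs="a"/>
  <Entity id="d" refs=""/>
</Wrapper>
"""


def load_graph(xml_text):
    """Return {entity id: [ids it refers to]} for the entities under the wrapper."""
    root = ET.fromstring(xml_text)
    graph = {}
    for entity in root.findall("Entity"):
        graph[entity.get("id")] = entity.get("refs", "").split()
    return graph


def has_cycle(graph):
    """Detect a cycle with a depth-first search using three node colours."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for target in graph.get(node, []):
            state = colour.get(target, BLACK)  # unknown targets are ignored
            if state == GREY:
                return True
            if state == WHITE and visit(target):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


GRAPH = load_graph(DOC)
```

Here `GRAPH` has four top-level nodes, and `has_cycle(GRAPH)` is true because a, b and c refer to each other in a loop, even though the XML itself is an ordinary tree.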

Other CSDL characteristics of interest

  • CSDL does not share anything with XSD. It has a data type declaration system that looks the same, but it does specify (to make it all clear) that all datatypes are derived from xs:string. Any attempt to reconcile the two type systems is therefore application dependent, not standard. There is a simple facet system for creating new datatypes (derivation by restriction, but CSDL does not use type derivation terminology).
  • CSDL does support kinds of choices and a thing called a ComplexType, in which order is important. But these are exceptions rather than the usual case.
  • For multiplicity on associations, CSDL allows 0..1, 1, or * at each end.
  • Functions can be defined.
  • Referential constraints allow a kind of parent/child relationship to be declared.
  • Probably the most interesting thing to me in CSDL is the OnDelete element:
    <OnDelete> is a trigger that is associated with a relationship. The action is performed on one end of the relationship when the state of the other side of the relationship changes.

    The example given is

    <Association Name="CProductCategory">
      <End Type="Self.CProduct" Multiplicity="*" />
      <End Type="Self.CCategory" Multiplicity="0..1">
        <OnDelete Action="Cascade" />
      </End>
    ...

    "Cascade" is not a very illuminating word. But I think the idea is that (in order to support delete-ish operations) the strength of coupling between entities needs to be modeled: i.e., is the coupling between the entities at the ends of an association so strong that if the first is deleted, the other should also be deleted (and should that deletion in turn be cascaded)?

    This seems to me to be a really important and useful idea, and one that could be applicable to any graph-based or schema language, including Schematron.
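One reading of the OnDelete idea can be sketched in Python. Everything here is an assumption for illustration: `EntityStore`, `link_cascade` and `delete` are hypothetical names, not part of CSDL or any Microsoft API, and the sketch assumes that Cascade on an end means that deleting an entity at that end also deletes the entities associated with it, transitively.

```python
from collections import defaultdict


class EntityStore:
    """Toy in-memory store with cascade-on-delete links (illustrative only)."""

    def __init__(self):
        self.entities = set()
        # cascade_links[x] holds the entities that must go when x is deleted.
        self.cascade_links = defaultdict(set)

    def add(self, entity):
        self.entities.add(entity)

    def link_cascade(self, principal, dependent):
        """Record that deleting `principal` should also delete `dependent`."""
        self.cascade_links[principal].add(dependent)

    def delete(self, entity):
        """Delete `entity` plus everything reachable through cascade links.

        Returns the set of entities that were actually removed.
        """
        deleted = set()
        stack = [entity]
        while stack:
            current = stack.pop()
            if current in deleted or current not in self.entities:
                continue  # already handled, or never stored
            deleted.add(current)
            stack.extend(self.cascade_links.get(current, ()))
        self.entities -= deleted
        for gone in deleted:
            self.cascade_links.pop(gone, None)
        return deleted


# Usage mirroring the example: the category end carries the cascade.
store = EntityStore()
for name in ("books", "toys", "p1", "p2", "p3"):
    store.add(name)
store.link_cascade("books", "p1")
store.link_cascade("books", "p2")
store.link_cascade("toys", "p3")
removed = store.delete("books")
```

After the call, `removed` holds the category and its two products, while the unrelated category and its product remain in `store.entities`. The explicit stack and the `deleted` set keep the traversal safe on cyclic graphs, which matters given the data model described earlier.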




2 Comments

Where's the plain English version of this Rick? Like, why's it useful? What's it replace? How's it used?
Looks like you've been swallowing dictionaries.

DaveP

Dave: CSDL would be useful when

1) You have relational data/systems/skills

2) You don't mind having a fairly flat data format (i.e., lots of links or keys)

3) You want to escape using XSD because you are sick of people's heads exploding

4) RELAX NG doesn't bring much to the table because it does not know about links or keys or validating them

5) Schematron does not bring much to the table because it is not declarative enough or too powerful

It is part of the Microsoft Open Data Protocol (ODP). Basically, this seems to be Atom plus extensions, with data sent using CSDL schemas. The kind of space where you might be thinking SOAP+WSDL+WS* perhaps, then shoot yourself.

I suspect, on no evidence whatsoever, that MS has now (after 10 years of XSD) had long enough to write off their XSD investment and look at the failures of the XSD-based systems, and decided that what they need is something much simpler than XSD and its nutty type systems for the next generation of platform: less all-singing, all-dancing, all-confusing. (Any Microsofties care to comment on this? If you are not too busy patenting whatever is not nailed down that is :-) )
