Fun discussion.

On Tue, Dec 11, 2012 at 7:39 AM, Lou Burnard
<[log in to unmask]> wrote:
> Anyone who expects to process an SGML document without a DOCTYPE is of
> course barking, but I guess you might argue that an XML document without one
> is within the bounds of possibility. But why would the judge in her chambers
> expect any XML document necessarily to behave the same way in the presence
> of a DOCTYPE or without one?

Well it depends on whether the judge is an SGML expert.

If she is, she'll say "what is this notion of processing a document
without its DTD? How do you expect to infer all the element boundaries
without the tag omissibility indicators in their declarations? What
about any uses of SHORTREF that the DTD designers have introduced?"

At that point, counsel for the defense might say "But Your Honor, this
is XML, and a non-validating XML parser is not obliged to read an
external DTD, nor are XML documents obliged to refer to one. Features
such as tag omissibility were removed partly in order to enable this.
This was a deliberate
design decision, made in view of the fact that when exchanging encoded
documents over the web, imposing a requirement to process the DTD
every time the document is parsed is onerous -- especially in cases
where the DTD is many times larger than the document. It also makes
conformant parsers significantly harder to build, only to support a
feature that is a frequent source of hard-to-find bugs."

At that point, the prosecuting attorney points out that this is a flaw
in the XML architecture, since it means that a document parsed with its
DTD may differ from the same document parsed without it, because of
attribute value defaulting if nothing else.
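
For the jury's benefit, here is a minimal sketch of the prosecution's
point, using Python's standard xml.etree.ElementTree (which sits on top
of expat, a non-validating parser that still reads the internal DTD
subset). The element name "note" and the attribute "status" are made up
for illustration and are not taken from any TEI schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset declares a default value
# for the "status" attribute of <note>.
WITH_DTD = """<!DOCTYPE note [
  <!ATTLIST note status CDATA "draft">
]>
<note>Eat your vegetables</note>"""

# The same instance with the DOCTYPE declaration removed.
WITHOUT_DTD = "<note>Eat your vegetables</note>"


def attributes(xml_text):
    """Return the attributes the parser reports on the root element."""
    return dict(ET.fromstring(xml_text).attrib)


with_dtd = attributes(WITH_DTD)
without_dtd = attributes(WITHOUT_DTD)

print(with_dtd)     # attributes seen when the declaration is read
print(without_dtd)  # attributes seen when there is no declaration
```

The element content is identical in both cases; only the presence of
the declaration changes what downstream code sees on the element.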

The defense suggests that treating the DTD as a specification for a
document transformation, as opposed to a specification of constraints
for purposes of validation, is the true design flaw, and disallowing
attribute defaulting in schemas entirely would be a better solution.
The prosecution points out that many systems rely on schemas to
specify what amount to transformations, such as assignment of data
types. The defense erupts in incoherent babbling about XSD.

The judge throws up her hands and suggests this is a matter for the
jury. So the three exit the chambers and return to the courtroom,
where the young journalists wonder why all the fuss, since they use
RELAX NG for validation and XSLT pipelines for normalization and
defaulting, and have never even given the issue any thought.
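
What the journalists do can be sketched roughly as follows: defaulting
becomes an explicit transformation step in the pipeline, and the schema
is left to validate. This is a Python stand-in for what would more
likely be an XSLT step; the DEFAULTS table, the apply_defaults helper,
and the element and attribute names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults table: element name -> {attribute: default value}.
# In a real pipeline this would live in the normalization stylesheet,
# not in the schema used for validation.
DEFAULTS = {"note": {"status": "draft"}}


def apply_defaults(root, defaults):
    """Add each default attribute to matching elements that lack it."""
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            # setdefault leaves explicitly specified values untouched
            elem.attrib.setdefault(name, value)
    return root


doc = ET.fromstring('<notes><note/><note status="final"/></notes>')
apply_defaults(doc, DEFAULTS)

statuses = [note.get("status") for note in doc.iter("note")]
print(statuses)
```

Because the defaults are applied by a visible step, the result no
longer depends on whether a given parser happened to read a DTD.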

> This isn't a problem unique to the TEI.

It certainly ain't. I've argued against setting default values in
schemas in other contexts as well.


Wendell Piez |
XML | XSLT | electronic publishing
Eat Your Vegetables