TEI-L@LISTSERV.BROWN.EDU



TEI-L November 2008

Subject: Re: Why RNG instead of DTD?
From: Wendell Piez <[log in to unmask]>
Date: Wed, 5 Nov 2008 19:07:31 -0500

Hi,

Coming to this a little late....

At 08:16 AM 10/30/2008, Hugh wrote:
>DTDs are very powerful things, but I've come to the conclusion that
>features like entities and default attribute values are Bad Things
>whenever you have data that you might use other XML tools on.  For
>example, if I want to use an XSLT to add id attributes to certain
>elements in my document, the mere fact of running said document
>through the stylesheet will resolve those entity references and strip
>my DTD declaration from the document.  I have to have the XSLT add it
>back, and God help me if I have an internal subset.
>
>All of this can be handled one way or the other, but the point is that
>with DTDs I have a rule set that changes the shape of my document when
>I parse that document.  So the document as it appears to my processing
>tools (like XSLT), which are looking at it post-parsing may be quite
>different from the document as it appears to a person looking at it.
>Errors and misconceptions abound.  An identity transform is not an
>identity transform.
>
>So I've come to the conclusion that it's a fundamentally bad idea to
>use schemas that contain instance data (like entities).  And if you
>aren't going to use the parts of DTD that make it so powerful, then
>why not use something with namespace support, etc.?

Keep in mind that the XML DTD is essentially a subset of the SGML 
DTD, which provides a great deal of other sorts of support to parsing 
and markup, including tag omissibility and so forth, which were 
disallowed in XML in order to keep parsers smaller and simpler to 
implement than SGML parsers (in an age when machines were also 
considerably larger and faster than the machines for which SGML was designed).

In this context, the close dependency of instance and its 
declarations makes more sense than it has come to in XML, in which a 
much looser binding has come to be the norm. The exception to this 
rule would be heavy data-crunching applications, where compiling 
datatypes (and thus a tighter binding between instance and schema) is 
so rewarding for performance reasons. For the most part, in document 
processing and especially in projects that must focus on flexible and 
responsive document modeling (can anyone say "TEI"?), loose binding 
pays off in a considerable reduction of complexity, inasmuch as 
modeling and markup can usefully be isolated from one another, and 
the price of more machine cycles and memory used up shuttling bits 
around isn't too high to pay.

The things Hugh complains about here, general entities and implicit 
attribute values, are precisely the detritus left from the old days 
because they were considered, in 1998, still too valuable to lose. 
Now, with Unicode, better user interfaces, XInclude and XSLT, they 
are perhaps much less compelling than they were then.
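
For anyone who wants to see the effect Hugh describes, here is a minimal 
sketch. It assumes Python's standard xml.etree.ElementTree (an expat-based, 
non-validating parser) rather than an XSLT processor, and the document, 
entity name and attribute are made up for illustration. The parser expands 
the general entity, supplies the defaulted attribute, and the serialized 
result no longer carries the DOCTYPE or its internal subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: the internal subset declares one general entity
# and one attribute default. Neither is visible in the element markup.
SOURCE = """<!DOCTYPE doc [
  <!ENTITY tei "Text Encoding Initiative">
  <!ATTLIST p lang CDATA "en">
]>
<doc><p>&tei;</p></doc>"""

# Parse, then serialize again without changing anything: a naive
# "identity" round trip.
root = ET.fromstring(SOURCE)
p = root.find("p")
round_tripped = ET.tostring(root, encoding="unicode")

print(p.attrib)        # the default from the ATTLIST is now a real attribute
print(p.text)          # the entity reference has been replaced by its text
print(round_tripped)   # no DOCTYPE, no internal subset, no &tei;
```

The same thing happens in an XSLT pipeline, since the stylesheet only ever 
sees the tree the parser hands it.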

I wrote about some of these issues in my way-old paper, Beyond the 
'Descriptive vs Procedural' Distinction ... findable on that Internet thing.

DTDs still work well enough to be ubiquitous in certain environments 
where they always worked well. There, the best reason to use them is 
probably "if it ain't broke", etc.; underlying this are the (not 
inconsiderable) migration costs, given that DTDs and document sets 
are still evolving together, and projects find it hard to freeze even 
one or the other (and certainly not both) long enough to redesign and retool.

Yet it could also be said, as I once read in an advertisement for 
machine tools, "If you know you need new tools, and you haven't 
already bought them -- you're already paying for them."

Cheers,
Wendell



======================================================================
Wendell Piez                            mailto:[log in to unmask]
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
   Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
