After some consideration, I no longer believe that the use of
parameter entities for configuration in the DTD subset of a TEI
document is a good idea.
 
If SGML applications use the public or system identifier to select a
stylesheet (etc), there _must_ be a 1:1 mapping between identifiers
and document structure.  I am not suggesting banning general entities
from the subset (they cannot affect the DTD), and I am not suggesting
eliminating parameter entities like TEI.prose; rather, I am suggesting
that their declaration in a document's _DTD subset_ be deprecated in
favour of a more robust scheme.
 
Of course, parameter entities used exclusively for marked sections
within the document entity should not be deprecated, though there is
the possibility that they might accidentally affect the DTD -- this is
a flaw in ISO 8879:1986, not in TEI.
 
 
* A SEPARATE PUBLIC IDENTIFIER FOR EACH BASE TAG SET
 
The most serious problem in current usage is that the following three
examples are all actually different (though closely related) DTDs,
masquerading under the same public identifier (note that in the
absence of standard TEI public identifiers, I have supplied my own -- I
could also have written SYSTEM "tei2.dtd" for each example):
 
  <!DOCTYPE tei.2 PUBLIC "-//local//DTD TEI P3//EN" [
    <!ENTITY % TEI.prose "INCLUDE">
  ]>
 
  <!DOCTYPE tei.2 PUBLIC "-//local//DTD TEI P3//EN" [
    <!ENTITY % TEI.verse "INCLUDE">
  ]>
 
  <!DOCTYPE tei.2 PUBLIC "-//local//DTD TEI P3//EN" [
    <!ENTITY % TEI.drama "INCLUDE">
  ]>
 
Since these are three different DTDs, they really require three
different identifiers:
 
  <!DOCTYPE tei.2 PUBLIC "-//local//DTD TEI Prose//EN">
 
  <!DOCTYPE tei.2 PUBLIC "-//local//DTD TEI Verse//EN">
 
  <!DOCTYPE tei.2 PUBLIC "-//local//DTD TEI Drama//EN">
 
Or, alternatively,
 
  <!DOCTYPE tei.2 SYSTEM "p3prose.dtd">
 
  <!DOCTYPE tei.2 SYSTEM "p3verse.dtd">
 
  <!DOCTYPE tei.2 SYSTEM "p3drama.dtd">
 
The '<!ENTITY % TEI.prose "INCLUDE">' declaration would appear in a
separate, standard file (i.e. "p3prose.dtd") instead of at the top of
each document.
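
As a minimal sketch of what such a file might contain (assuming the
main TEI DTD is reachable as "tei2.dtd", as in the SYSTEM alternative
mentioned above; the entity name %TEI.main; is hypothetical and chosen
only for illustration), "p3prose.dtd" could be as short as this:

  <!-- p3prose.dtd: hypothetical driver file for the prose base -->
  <!-- Select the prose base tag set once, for every document -->
  <!ENTITY % TEI.prose "INCLUDE">

  <!-- Pull in the main TEI DTD; the entity name and the system -->
  <!-- identifier are assumptions, not standard TEI names -->
  <!ENTITY % TEI.main SYSTEM "tei2.dtd">
  %TEI.main;

A document that begins with '<!DOCTYPE tei.2 SYSTEM "p3prose.dtd">'
would then need no DTD subset at all to select its base tag set.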
 
 
* ADDITIONAL TAG SETS ENABLED BY DEFAULT
 
The additional tag sets raise somewhat different difficulties, but perhaps
it would be best simply to enable all of them by default -- that way,
an application processing a TEI document can be reasonably certain
about what kind of document it is dealing with.  There is no conflict
among different additional tag sets, and there is no reason to disable
them in the first place with today's larger and faster computers.
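
For illustration only, a standard file like "p3prose.dtd" might
enable the additional tag sets alongside the base.  The three entity
names below are given as I understand them from P3 and should be
checked against the Guidelines; the list is deliberately incomplete,
and the %TEI.main; entity is again a hypothetical name:

  <!-- Base tag set -->
  <!ENTITY % TEI.prose    "INCLUDE">

  <!-- Additional tag sets, all enabled by default (partial list; -->
  <!-- names assumed from P3, not verified here) -->
  <!ENTITY % TEI.linking  "INCLUDE">
  <!ENTITY % TEI.analysis "INCLUDE">
  <!ENTITY % TEI.figures  "INCLUDE">

  <!-- Main TEI DTD; entity name and file name are assumptions -->
  <!ENTITY % TEI.main SYSTEM "tei2.dtd">
  %TEI.main;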
 
The only argument in favour of using parameter entities to enable
additional tag sets is that new sets can be added without outdating
older documents; however, if documents make it clear which version of
the TEI DTDs they use, and the version is updated as new tag sets are
added, this problem would not exist.
 
 
* EXTENSIONS REQUIRE THEIR OWN PUBLIC IDENTIFIER
 
Likewise, %TEI.extensions.dtd; should not be declared in a DTD subset,
since it would alter the document's structure without changing the
public identifier; instead, the document should begin with a new public
identifier associated with a file containing the TEI.extensions.dtd
declaration -- e.g.
 
  <!DOCTYPE tei.2 PUBLIC "-//local//DTD TEI P3 Prose/Exercises//EN">
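
The file associated with that identifier might look something like
the following sketch.  The file name "exercises.dtd" and the entity
name %TEI.main; are hypothetical, and I am assuming that the main DTD
references %TEI.extensions.dtd; at the appropriate point:

  <!-- Hypothetical file behind "TEI P3 Prose/Exercises" -->
  <!ENTITY % TEI.prose "INCLUDE">

  <!-- Local extensions live in their own file (name assumed) -->
  <!ENTITY % TEI.extensions.dtd SYSTEM "exercises.dtd">

  <!-- Main TEI DTD; entity name and file name are assumptions -->
  <!ENTITY % TEI.main SYSTEM "tei2.dtd">
  %TEI.main;

Because the extension is named in the file rather than in each
document's subset, any two documents with this public identifier are
guaranteed to share the same structure.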
 
 
 
David
 
--
David Megginson                Department of English, University of Ottawa,
                               Ottawa, Ontario, CANADA  K1N 6N5
                               Phone: (613) 562-5800 ext.1203
WWW: http://www.uottawa.ca/~dmeggins  FAX: (613) 562-5990