Dear reader,
 
A few questions/remarks on things I have not come to terms with:
 
  1) The document element's name for TEI P3 documents.
  2) Use of public identifiers.
 
(I hope I haven't missed earlier postings on these), and an invitation:
 
  3) Ref to HTML/PS article on semantic specifications.
 
If needed, reply on TEI-TECH, but this may be something for TEI encoders
to comment upon.
 
1) I'd like to hear what you think about the following. I guess most of you
simply use <TEI.2> as the document element for TEI texts. I question that
practice below.
 
I always construct my TEI documents with something like:
 
| <!DOCTYPE TEI.prose PUBLIC "-//TEI//DTD TEI P3//EN" [
|
| <!-- tell tag set that document type name is TEI.prose -->
|
| <!ENTITY % n.TEI.2 "TEI.prose">
|
| ......invoke prose ..
|
| ]>
| <TEI.prose>.......</TEI.prose>
 
This seems a correct approach, as
 
    A) a system may be able to process special constructs for prose,
    while some other system may be tailored to process verse DTDs. It is
    correct for a system to determine processing characteristics
    required & expected based on the document element's GI (i.e. the
    document type name).
 
    B) a prose text is not of the same 'document type' as a verse text.
    This must be expressed through document type names, and subsequently
    through the name of the document element.
 
          Side:
          The TEI is not a DTD, but an 'enabling architecture', or set
          of tag sets, for creating DTDs.
 
Now, in line with the approach advocated by the HyTime standard that
introduces 'architectural forms', systems may be instructed to take on
the processing semantics of the _architecture_ associated with the
document element (or any other element, for that matter).
 
The TEIform= attribute specifies that the document element is of the
'architectural form' "TEI.2", meaning that TEI.2 semantics apply to the
document even when named TEI.prose.
 
         Side:
         P3 specifies that the TEIform is introduced (only?) to tell the
         application processor that some _renamed_ element X still
         conforms to the TEI form Y.
 
Consequently,
 
| <TEI.prose TEIform='TEI.2'>.......</TEI.prose>
 
is implied, and
 
| <TEI.2>.......</TEI.2>
 
is largely what is meant by it. This seems a correct approach for
generalizing over national alterations of element names.
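
To make the dispatch described above concrete, here is a minimal sketch in
Python. It assumes a processor that looks at the TEIform value first and
falls back on the element's own name only when the attribute is absent. The
names architectural_form, HANDLERS and process are hypothetical, made up
for this illustration; they come neither from P3 nor from HyTime.

```python
# Hypothetical sketch: choose processing semantics by architectural form.
# architectural_form(), HANDLERS and process() are illustrative names only.

def architectural_form(gi, attributes):
    """Return the TEI form of an element: TEIform if given, else its own GI."""
    return attributes.get("TEIform", gi)

# Assumed table of semantics known to this (imaginary) application.
HANDLERS = {
    "TEI.2": lambda gi: f"process <{gi}> with TEI.2 semantics",
}

def process(gi, attributes=None):
    """Dispatch on the architectural form rather than on the element name."""
    form = architectural_form(gi, attributes or {})
    handler = HANDLERS.get(form)
    if handler is None:
        return f"no semantics known for <{gi}>"
    return handler(gi)

print(process("TEI.prose", {"TEIform": "TEI.2"}))
print(process("TEI.2"))
print(process("TEI.prose"))
```

In this sketch a document element named TEI.prose is handled as TEI.2 only
when TEIform says so; without the attribute the application has nothing to
go on, which is exactly the case the question below is about.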
 
   Side:
   I still do not understand why attributes are not dealt with in the same
   way.
 
The question here is: does the TEIform attribute really apply to the
document element? If so, please share your considerations on this, and
if not, how must we adapt the P3 to reflect inherent differences between
prose, verse, drama, etc. (i.e. must the <TEI.prose TEIform='TEI.2'> be
removed?) Should this extend to the additional tag sets?
 
  Side:
  Here I assume element semantics do not change in different DTDs; they
  may, however, change in different _contexts_ (ancestor and other
  contextual elements).
 
2) I have recently updated my TEI2.DTD and related entities to include
entity text by public identifiers. The distributed version uses system
identifiers. They are 'hard-coded': relocating a file could mean the TEI2
'web' gets broken. The regular way of avoiding this is the use of public
identifiers. I therefore advocate public identifiers for all parts of
the P3, not only for the TEI2.DTD entity itself.
 
I'll explain this. System identifiers are _not_ file names; rather, they
name data objects, to be resolved by some data manager into entity
text, which is in turn included in the SGML document. That the data
manager is usually a file system, and that file systems usually support
a 'current path', is of no importance in the distribution of (part of) an
SGML document.
 
    Side:
    The data could well be contained in a zip file, a relational
    database, or the post office. This poses, in my view, a serious
    problem when local but external data is to be included in a document.
    Consider:
 
      <!ENTITY local.data SYSTEM "local.identifier" NDATA some.notation>
 
    This may work fine on a file system, where local.identifier maps onto
    a file name in the current document's path, but what about a relational
    database where local.identifier is a key value, and the data manager
    extracts the 'file' field from the relation? How do we port that?
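
The porting problem in the side note can be sketched as follows, in Python.
Two data managers resolve the same system identifier: one treats it as a
name in a 'current path', the other as a key in a relation. Both classes,
the table layout and the key "key-0042" are assumptions invented for this
illustration; no real SGML system is being described.

```python
# Hypothetical sketch: one system identifier, two different data managers.
import sqlite3

class FileLikeManager:
    """Treats the system identifier as a name relative to a 'current path'."""
    def __init__(self, files):
        self.files = files  # stands in for a directory: name -> bytes

    def resolve(self, system_id):
        return self.files.get(system_id)

class DatabaseManager:
    """Treats the system identifier as a key value in a relation."""
    def __init__(self, rows):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE entity (id TEXT PRIMARY KEY, file BLOB)")
        self.db.executemany("INSERT INTO entity VALUES (?, ?)", rows)

    def resolve(self, system_id):
        row = self.db.execute(
            "SELECT file FROM entity WHERE id = ?", (system_id,)
        ).fetchone()
        return row[0] if row else None

# The same data, stored under whatever name each system happens to use.
here = FileLikeManager({"local.identifier": b"some data"})
there = DatabaseManager([("key-0042", b"some data")])

print(here.resolve("local.identifier"))   # found on the originating system
print(there.resolve("local.identifier"))  # not found after 'porting'
```

The identifier that resolves on the first system means nothing on the
second, although the data itself is present there.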
 
Consequently, each entity should be named such that _on any system_ the
individual entity references can be resolved to entity text. This
requires, in SGML terms, the use of public identifiers for any part of
the P3; data entities can be identified either publicly or by some
system identifier (see also the side note above).
 
This calls for a P3 version formally structured through public
identifiers, and (still) informally mapped onto files in distribution
through a catalog file.
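
As a minimal sketch of such a mapping, the Python fragment below reads a
small catalog and resolves a public identifier to a local system
identifier. The catalog syntax is loosely modelled on PUBLIC entries as
used in SGML Open style catalogs, which is an assumption about the format;
the second public identifier and both file names are made up for
illustration and are not the real TEI ones.

```python
# Hypothetical sketch of catalog-based resolution of public identifiers.
# The second entry and both file names below are invented for illustration.
import shlex

CATALOG_TEXT = '''
PUBLIC "-//TEI//DTD TEI P3//EN"             "tei2.dtd"
PUBLIC "-//TEI//ELEMENTS Prose Base//EN"    "teipros2.dtd"
'''

def parse_catalog(text):
    """Map each public identifier to the system identifier listed for it."""
    mapping = {}
    for line in text.splitlines():
        fields = shlex.split(line)  # keeps each double-quoted literal whole
        if len(fields) == 3 and fields[0] == "PUBLIC":
            mapping[fields[1]] = fields[2]
    return mapping

def resolve(public_id, catalog):
    """Return the local system identifier, or None if the catalog lacks it."""
    return catalog.get(public_id)

catalog = parse_catalog(CATALOG_TEXT)
print(resolve("-//TEI//DTD TEI P3//EN", catalog))
```

Relocating a file then means editing one catalog entry; the public
identifiers inside the tag sets stay untouched.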
 
The catalog distributed as ftp://ftp-tei.uic.edu/pub/tei/dtd/fpi by
Michael and Lou doesn't physically upgrade the tag sets to the FPIs,
as far as my web crawling could tell, anyway.
 
3) Finally, please feel free to comment on
 
   http://www.let.ruu.nl/C+L/loeffen/art/stinfon/stinfon.ht
 
This is an article on semantic specifications for SGML-encoded
documents. It may be interesting to TEI encoders and users.
 
Thanks in advance,
Arjan.
   Arjan Loeffen
   Faculty of Arts, Utrecht University
   Achter de Dom 22-24, 3512JP Utrecht, The Netherlands
   ++31+302536417 voice work
   ++31+302539221 fax work
   ++31+206855772 voice home
   http://www.let.ruu.nl/C+L/loeffen/home.htm