> I am involved in converting a print encyclopedia to SGML and we
> are making use of the TEI lite DTD with the Name and Date
> extensions.
I'm not 100% sure what this means, but if it means you have taken a
flat TEI Lite DTD and added elements from chapter 20, then you have
violated TEI's copyright. TEI Lite is not an extendable DTD; it is a
derived "view" of the (very) extensible TEI DTD system. If what you
want is a DTD that is similar to TEI Lite but has the elements and
attributes from chapter 20 in addition, use the full TEI DTD. With
the advent of the "pizza page" web sites for TEI DTD creation[1],
it's easy, fun, and educational for the whole family!
 
> ... In particular I am interested in applications of the extended
> tag sets for names and dates ...
We have made heavy use of parts of the additional tag set for names
and dates. In particular we make use of PERSNAME, PLACENAME, and
ORGNAME, but not the various sub-parts thereof (e.g., SURNAME,
FORENAME, ROLENAME; SETTLEMENT, REGION, COUNTRY); we make extensive
use of the key= attribute for PERSNAMEs, but generally not for other
elements. Similarly we use DATE and TIME as declared in 6.4.4, but do
not make use of the various sub-parts thereof (YEAR, MONTH; HOUR,
MINUTE); we make extensive use of the value= attribute as declared in
6.4.4, but do not use zone=[2], type=, or certainty=[3].
 
For names, we have found it important to have criteria with which to
determine whether or not a string is a proper noun, and whether or
not an item named is a person. (Which of the following are PERSNAMEs
-- Lou the First, King of the Encoders; Lou of Encoderville; Lou
Encoderville; King Lou; the King; President Clinton; Mr. President;
Slick Willy?)
 
We uniquely identify each PERSNAME that names someone who exists
outside the text (historical figures, and fictional or mythological
figures that are part of the general culture) with a key specified
on key=. For a complete discussion see [4].
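
To make the keying idea concrete, here is a minimal sketch in Python.
It is an illustration only, not the project's actual tooling: the
registry, the key strings, and the function name tag_persname are all
hypothetical, and the real key scheme is the one described in [4].
The point it shows is that different name strings for the same person
share a single key, while a name with no known referent gets no key=.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical key registry. The keys and name forms below are invented
# for illustration and do not follow any real project's key scheme.
PERSON_KEYS = {
    "Lou the First, King of the Encoders": "LouI.enc",
    "King Lou": "LouI.enc",
    "President Clinton": "Clinton.wj",
}


def tag_persname(name, keys=PERSON_KEYS):
    """Wrap a name string in PERSNAME, adding key= when the person is known."""
    key = keys.get(name)
    if key is None:
        # No entry in the registry: tag the name but leave key= off.
        return "<PERSNAME>%s</PERSNAME>" % escape(name)
    # quoteattr() supplies the surrounding quotes and escapes the value.
    return "<PERSNAME key=%s>%s</PERSNAME>" % (quoteattr(key), escape(name))


print(tag_persname("King Lou"))
print(tag_persname("the King"))
```

Whether a string like "the King" should be a PERSNAME at all is
exactly the kind of question the encoding criteria have to settle; the
sketch only covers what happens once that decision has been made.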
 
The standard form we use on value= of both DATE and TIME is an ISO
8601 extended form (as complete as possible). We do not use DATERANGE
or TIMERANGE because a span of time can be represented on the single
attribute value= of DATE and TIME, rather than needing the from= and
to= of DATERANGE and TIMERANGE.[3] For information on ISO 8601 see
[5].
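
As a rough illustration of what such values look like, the following
Python sketch builds ISO 8601 extended-format strings for value=. The
function names (date_value, span_value, date_tag) are hypothetical,
and the use of the ISO 8601 solidus notation ("start/end") for a span
is an assumption about how a span would be written on a single
attribute, not a statement of the project's documented practice.

```python
from datetime import date


def date_value(year, month=None, day=None):
    """ISO 8601 extended-format date, as complete as the known parts allow."""
    if month is None:
        return "%04d" % year
    if day is None:
        return "%04d-%02d" % (year, month)
    # date() also validates the month and day before formatting.
    return date(year, month, day).isoformat()


def span_value(start, end):
    """A span of time as one value: start and end joined by a solidus."""
    # Assumption: ISO 8601 interval notation, "start/end".
    return "%s/%s" % (start, end)


def date_tag(text, value):
    """Wrap the text as it appears in the source in DATE with value=."""
    return '<DATE value="%s">%s</DATE>' % (value, text)


print(date_tag("4 March 1650", date_value(1650, 3, 4)))
print(date_tag("March to May 1650",
               span_value(date_value(1650, 3), date_value(1650, 5))))
```

Because the span fits on the one attribute, no separate from= and to=
are needed, which is the reason given above for not using DATERANGE.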
 
Our internal documentation on encoding policies for names and dates
is not yet publicly available (every time I mention them I say "in
a few months" :-), but I'd be happy to send a copy upon request.
 
-- Syd Bauman, EMT-Paramedic
   SGML & XML programmer / analyst
   Brown University Women Writers Project
   401-863-3835
 
Notes
-----
[1] See http://www.uic.edu/orgs/tei/sgml/pizza.html and
    http://www.hcu.ox.ac.uk/TEI/pizza.html. If usage is not
    self-evident, ask questions here. Many of us who use the pizza
    pages, as well as their authors, read this list.
[2] Because zone information, if desired, can be included
    in the standardized value of value=.
[3] Because, to some extent, the precision is indicated by the
    standardized value of value=. This is not completely true,
    though, nor is there any way to indicate accuracy; but that's not
    what certainty= is for. Probably the additional tagset for
    certainty and responsibility would be needed to indicate
    accuracy.
[4] http://www.wwp.brown.edu/vol02num03/nameKey-home.html
[5] http://www.cl.cam.ac.uk/~mgk25/iso-time.html is an excellent
    starting point. The page gives an overview of the standard, some
    of the reasoning behind it, some discussion from a programmer's
    point of view, and pointers to the standard itself and the draft
    of a forthcoming version. My only complaint is that the author
    buys into this "24:00" nonsense.