Dear world,
 
the following is a short text about trouble I faced
while preparing my first real hands-on experiment
with TEI tagging. Old-timers may well sentimentally
wag their heads at these greenhorn problems; still,
I'd like to share them. If responses are not too
negative, I may come up with more in the future :-)
 
     Tobias
 
PS: I tend to express myself in straight sentences
    like "I want.." or "I will...". Take this to
    mean "I tend to think so, what do you think?"
 
 
                  Coding Henry James: `Daisy Miller'
                  ==================================
 
 
The TEI encoding of Henry James's `Daisy Miller' is no more than a
first experiment in actually using the tools I have and the TEI coding
schemes.  The experiment will touch common encoding questions as to
what feature is to be encoded how -- and offer an easy introduction to
the coding of textual variants, because the story exists in the original
version from 1878/9 and the version as Henry James had it published in
the New York edition; the differences between these versions are
interesting enough to justify their encoding.  Of course, though, to
encode the differences between two linear texts will expose only a
fraction of the possible problems of encoding critical texts.
 
Here, I'd just like to address two principal problems concerning
`tagging philosophy' that I struggle with.  I'd like to come back
with more practical questions and their tentative answers when I'm
further along with the practical work.
 
Clear categories.
-----------------
 
If I want my encoding to be useful as approximately objective
information, I feel it necessary to restrict myself to encoding clean
categories that are accepted by others, too, and that give meaningful
results when they are used as the basis for research.  My typical
horror would be a tag <irony> for marking up text passages considered
ironical -- glad to say, this tag does not exist in TEI.  I would
equally not dare to use <q> tags when encoding a stream of
consciousness such as the Lotus Eaters chapter or the Proteus chapter
in Ulysses to indicate where direct speech or `direct thought' is
present... the boundaries simply seem too hazy to me, although there
are _some_ clean and obvious quote signs in these texts.  When you tag
something as present, you all too strongly say it is absent where you
_don't_ tag it.
 
Features vs. Phenomena
----------------------
 
While I tried to compile a what->how list for appropriate encoding of
the Daisy Miller text, I ran again and again into the same type of
problem; I illustrate it here for the <foreign> and <rs> tags:
 
Foreign words are generally rendered italic in the printed text
(Complete Works, ed. Leon Edel), but not consistently. So, "marchese"
is rendered italic, but "cicerone" is not. I face the decision either
just to code the typographical feature where I find it and consider it
to be <foreign> rather than <emph> -- or to decide for myself what
words to consider <foreign> and tag them all.  The first is a clear
policy, but is basically useful only for typesetting the text anew --
research into the use of foreign words would not catch half the
interesting material by hunting for <foreign> tags.  The second is
more useful for researchers, but means massively more work and more
tags -- and it implies my more or less arbitrary drawing of a
borderline around the <foreign> category (what do native English
speakers feel about a word like "dyspepsia"?  Is it <foreign> or
perhaps <term>, or Standard English?).
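
To make the difference between the two policies concrete, here is a
minimal sketch in Python.  Everything in it is an assumption made for
illustration: the sample sentence is invented (only the two Italian
words come from the text), the rend values are my own choice, and the
helper foreign_words is hypothetical, not part of any TEI tool.  It
simply hunts for <foreign> tags the way a researcher might.

```python
import xml.etree.ElementTree as ET

# Invented sample sentence, not a quotation from `Daisy Miller'.
# Policy 1: tag only what the printed edition sets in italics.
POLICY_1 = "<p>The <foreign>marchese</foreign> hired a cicerone.</p>"

# Policy 2: tag every word the encoder judges to be foreign, and record
# the typography separately (the rend values here are assumptions).
POLICY_2 = (
    '<p>The <foreign rend="italic">marchese</foreign> hired a '
    '<foreign rend="roman">cicerone</foreign>.</p>'
)


def foreign_words(fragment):
    """Return the text of every <foreign> element in a markup fragment."""
    root = ET.fromstring(fragment)
    return [element.text for element in root.iter("foreign")]


if __name__ == "__main__":
    print(foreign_words(POLICY_1))  # ['marchese']
    print(foreign_words(POLICY_2))  # ['marchese', 'cicerone']
```

Under the first policy the search misses "cicerone" altogether; under
the second it finds both words, but only because I decided beforehand
that both belong to the category.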
 
The TEI standards (and the implied meaning and value of TEI tagging)
seem to suggest the second policy, although they often speak about
`deciding what a typographical feature stands for' [no actual
quotation].  How cumbersome a burden this may become is perhaps made
clear with another example.  The Green Book gives in section 6.4.1 an
example for the use of <rs> tags to encode reference to persons,
places, or the like.  The example is drawn from Jane Austen's `Pride
and Prejudice'.  If an encoder proceeded to follow this example,
though, she or he would hopelessly flood the text with <rs> tags,
because, after all, the whole novel deals with nothing but people
referring to other people (and, occasionally, places).  If one were to
restrict the tagging to `selected' or `interesting' references or
referenced objects, though, the usefulness would diminish rapidly,
again, because the tags don't cover the whole material, and the
selection policy that seems clean and obvious to one will certainly
appear messy and cumbersome to others.
 
I don't really see a way out of this dilemma, which tends to come up
quite often when I try to determine if and where to apply a tag.  The
obvious answer that this is the realm of a `responsible decision of
the editor' does not help very much either -- in any case I'd like to
hear about the way other `responsible editors' face this decision.
 
 
 
--
...........................................
:       [i Tromsoe til ca. 6'95]          :
:                                   :-)   :
: Tobias Rischer                          :
: Tunveien 9 A21                          :
: 9018 TROMSOE                            :
:   NORGE                                 :
:                                         :
: email:                                  :
:   [log in to unmask]                :
:  ([log in to unmask])    :
:.........................................: