(1) makes me feel much more uneasy than (2), though I can understand the
preference for (1) if epigraphers really do divide their strings into
mutually exclusive classes of "name" and "word". But is this true?
Isn't "name"-ness a property additional to "word"-ness? So
<name><w>Gabriel</w> <w part="y">Boda</w></name> <w>fecit</w>
makes perfectly good sense to me -- and also simplifies the processing,
since a processor interested only in words can reliably just look for
<w>s, without having to be told also to consider <name>s.
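To make the processing point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is an illustration only: the <ab> wrapper, the function names words() and names(), and the absence of the TEI namespace are all my assumptions for the sake of a short example, not anything a real corpus or the Guidelines prescribe.

```python
import xml.etree.ElementTree as ET

# The inline example from this message, wrapped in a hypothetical <ab> root
# so that it parses as a single well-formed document. A real TEI file would
# also carry the TEI namespace, which this sketch leaves out.
SAMPLE = (
    '<ab><name><w>Gabriel</w> <w part="y">Boda</w></name> '
    '<w>fecit</w></ab>'
)


def words(xml_text):
    """Return (text, part) for every <w>, however deeply it is nested."""
    root = ET.fromstring(xml_text)
    # iter("w") walks the whole tree in document order, so words inside
    # <name> are found without the caller knowing that <name> exists.
    return [(w.text, w.get("part")) for w in root.iter("w")]


def names(xml_text):
    """Return each <name> as a string joined from its constituent <w>s."""
    root = ET.fromstring(xml_text)
    return [" ".join(w.text for w in n.iter("w")) for n in root.iter("name")]


if __name__ == "__main__":
    print(words(SAMPLE))
    print(names(SAMPLE))
```

On the sample, words() yields Gabriel, Boda (carrying part="y") and fecit, while names() yields "Gabriel Boda". The word-level pass never has to mention <name> at all, which is the simplification I have in mind.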
I also feel that incompleteness and metricality are properties of
words, not of names. A name might spread across two verse lines, for
example. And "completeness" of a name might mean something different
from "completeness" of its constituent tokens. Is "L. Burnard" an
Gabriel Bodard wrote:
> I think this message was lost in the rush of the TEI MM. (Either that or
> it is genuinely uninteresting to everyone.)
> Let me state my options in the simplest possible terms:
> (1) I assume (as I have been up to now) that a name is not a word, and
> put in a feature request to add att.segLike attributes to name and
> related elements;
> (2) I tag all segmented or incomplete names additionally with a <seg>
> element. I don't much like it, but I can do.
> Gabriel Bodard wrote:
>> If a word is segmented or the <w> tag contains an incomplete word, the
>> segLike attribute @part is available to mark this word as segmented. The
>> att.segLike class is not available on elements such as <name>,
>> <persName>, <placeName>, however.
>> It would seem to me that anything one might want to say about a word or
>> other grammatical segment (that it is divided, that it has metre or
>> rhyme or other function, that it has a lemmatized or normalized
>> headword) one will also want to be able to say about names. In our
>> corpus all strings of transcribed characters that we have been able to
>> so resolve are tagged either as <w> or as <name>, with the result that I
>> can segment words but not names.
>> Any advice?
>> Dr Gabriel BODARD
>> (Epigrapher & Digital Classicist)
>> Centre for Computing in the Humanities
>> King's College London
>> 26-29 Drury Lane
>> London WC2B 5RL
>> Tel: +44 (0)20 7848 1388
>> Fax: +44 (0)20 7848 2980