I'm not sure this is a very nice solution, but for an old project a few
years ago we used two (whitespace-separated) values in the @lemma
attribute for alternative (or multiple) lemmata.
Problems:
(1) if your lemmata can contain whitespace, obviously this breaks.
(2) the semantics are probably very wrong
(3) this doesn't distinguish semantically between cases of uncertainty
on the one hand (e.g. <w lemma="apple apricot">napply</w>) and multiple
lexicographic lemmata for a single orthographic word on the other (e.g.
<w lemma="καί ἐκ">κἆκ</w>). For us this wasn't a problem, because we
wanted both to behave in the same way, i.e. to index under both words in
both cases.
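
For illustration only, here is a minimal sketch of what "index under both words" could look like on the processing side. It assumes Python and its standard xml.etree.ElementTree module, neither of which comes from the original project; the wrapper element <s>, the SAMPLE string and the function name index_by_lemma are hypothetical names made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: the two <w> elements from the examples above,
# wrapped in a made-up <s> element so the fragment is well-formed.
SAMPLE = """<s>
  <w lemma="apple apricot">napply</w>
  <w lemma="καί ἐκ">κἆκ</w>
</s>"""


def index_by_lemma(xml_text):
    """Map each lemma to the list of word forms indexed under it."""
    index = defaultdict(list)
    for w in ET.fromstring(xml_text).iter("w"):
        # split() with no argument splits on any run of whitespace, so a
        # lemma that itself contains a space is cut in two (problem 1).
        for lemma in w.get("lemma", "").split():
            index[lemma].append(w.text)
    return dict(index)


index = index_by_lemma(SAMPLE)
for lemma, forms in index.items():
    print(lemma, forms)
```

Note that the resulting index treats the uncertain case and the multiple-lemma case identically, which is exactly the loss of distinction described in (3).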
(I present this not so much in the hope that this will be a viable
solution for you, but rather that reactions against this solution might
help to reveal the correct solution. :-) )
Best,
Gabriel
On 2011-05-11 22:04, Arun Prasad wrote:
> Hi all,
>
> I'm currently toying with the idea of using the TEI format to represent
> lemmatized Sanskrit texts, much as the Clay Sanskrit Library was doing
> some time ago. My question, though, is quite general. I've been using
> this sort of structure to represent inflected words:
>
> <w type="verb" lemma="bhR" ana="#3s #pres #indic">bharati</w>
> <w type="noun" lemma="buddha" ana="#masc #ins #sg">buddhena</w>
>
> and it's worked well so far (although I don't know if this is indeed the
> proper way to do things). What I'm having some trouble with, however, is
> the representation of participles. A participle has some basic "stem",
> but it also comes from a verb root. So, the inflected participle
> "bharan" could have one of two values for its lemma: the participle stem
> "bharat" and the verb root "bhR."
>
> If possible, I would like to encode both of these values together. Is
> there any easy way to do so in the TEI format?
>
> Thanks,
> Arun Prasad
--
Dr Gabriel BODARD
(Research Associate in Digital Epigraphy)
Department of Digital Humanities
King's College London
26-29 Drury Lane
London WC2B 5RL
Tel: +44 (0)20 7848 1388
Fax: +44 (0)20 7848 2980
http://www.digitalclassicist.org/
http://www.currentepigraphy.org/