Lou Burnard wrote:
> In my opinion, you would do better to put the lemma value into an element of its
> own. The attribute value approach is really only suitable for simple cases.
> Redefining the datatype of the @lem attribute to accept spaces as you propose
> would be a bit problematic since that changes the definition. Of course, you
> could also argue that it *shouldn't* be defined as data.word... but it currently is!

I get the impression that Elena makes her point on the basis of the TEI
strategy not to impose one particular analysis on the annotator.
Obviously, with the great customisation mechanism of P5, one can simply
change this single definition and be done with it, but the question she
poses seems to be about the balance between the core and the periphery.

I've just had a quick look at four dictionaries of English from my shelf
-- three of them (Webster, Collins, Hornby) contain multiword headwords,
one (Cosmo, small) doesn't appear to. The point is that multiword
expressions aren't exotic by any standard. It is up to the lexicographer
what lemmatisation method s/he chooses. Of course, if a lemma is
treated as element content, there is no problem; but if it has to be
entered as an attribute, the TEI schema appears to be drawing a delicate
line (of perceived standards-adherence) where (by TEI's own policy, it
seems) it perhaps shouldn't.

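To make the constraint concrete, here is a minimal sketch in Python. It
assumes that data.word can be approximated as "a single token containing no
whitespace"; the real definition is a schema pattern, not this regular
expression. The function name and the sample lemmas are made up for
illustration and are not taken from any of the dictionaries mentioned.

```python
import re

# Assumption: data.word is approximated as one or more non-whitespace
# characters. The actual TEI datatype is defined in the schema, not here.
WORD_LIKE = re.compile(r"\S+")


def fits_word_datatype(value: str) -> bool:
    """Return True if the value is a single whitespace-free token."""
    return WORD_LIKE.fullmatch(value) is not None


# Hypothetical lemma values: one single-word and two multiword headwords.
lemmas = ["give", "give up", "ice cream"]
results = {lemma: fits_word_datatype(lemma) for lemma in lemmas}

for lemma, ok in results.items():
    print(f"{lemma!r}: {'accepted' if ok else 'rejected'} as an attribute value")
```

Under this approximation only the single-word lemma passes, so a multiword
headword would have to go into element content or wait for a customised
datatype.
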
This seems to be a case where a (possibly casual) 'programmatic'
decision in one little place (attribute data type) may have a bearing on
the entire architecture of one's abstract data model, at least its
default version, which may be perceived as in some way better, because
it doesn't require any changes to the spec, behind which an entire

I hope the above doesn't sound like pestering about a dead issue; it
just seemed worth bringing up (I admit to a weakness for such
meta-thingies ;-) ).