Lou Burnard wrote:
> This discussion is spectacularly missing the point. I've already made
> these points on the Council list, but I'll make them again here since
> the issue seems to have migrated to this list.
> The existing @lemma attribute is provided as a means of supplying a
> linguistically normalized form of something. Whether you like it or
> not, and whatever you were told in school, it is sometimes going to be
> the case that the "something" concerned is written as more than one
> orthographic word; likewise its normalized form. Whether you tag the
> "something" concerned with <w> or <seg> or <mw> (which is the BNC tag
> that James mentioned), you will still have this problem.
> Elsewhere and passim in the Guidelines for P5 we've taken the view
> that attributes should not be used to supply "textual" values -- and
> at P5 we enforce this by means of datatypes. So the current @lemma
> attribute has a datatype which doesn't permit strings containing
> spaces. This is *not* for linguistic reasons related to a theory about
> what lemmata are or should be -- it is because we don't support the
> appropriate schema datatype.
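To make the datatype point concrete, here is a minimal sketch in Python of the kind of check such a datatype imposes. It assumes, purely for illustration, that the constraint boils down to "non-empty, and no whitespace anywhere in the value". The function name `is_valid_lemma_attr` is hypothetical, and this is not the actual TEI schema definition, which may well be stricter about which characters it allows.

```python
def is_valid_lemma_attr(value: str) -> bool:
    """Return True if value is non-empty and contains no whitespace.

    Assumption: this only approximates a "single word" attribute
    datatype; the real schema pattern is not reproduced here.
    """
    return bool(value) and not any(ch.isspace() for ch in value)


if __name__ == "__main__":
    # Single-token lemmata pass; multi-word lemmata are rejected.
    for candidate in ["course", "of course", "n'est-ce pas", "caetera"]:
        print(f"{candidate!r}: {is_valid_lemma_attr(candidate)}")
```

Under that assumption the rejection of "of course" is purely mechanical: the check knows nothing about what a lemma is, only that the value contains a space.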
> The choice on which I asked for guidance from the Council (and now ask
> the TEI-L readership more generally) is whether we should
> (a) continue with the existing system
> (b) *remove* the @lemma attribute in favour of a <lemma> child
> (c) redefine the @lemma attribute to use a different datatype which
> does permit included spaces
> FWIW, I think
> (a) is sustainable only with the sort of ad hoc rule I suggested to Elena
> (b) is likely to be perceived as a nuisance by many existing users
> (but is at least consistent with the rest of the Guidelines)
> (c) special cases this attribute, and will immediately lead to
> requests for us to special case others similarly
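For readers weighing (b) against (c), the following Python sketch shows what each encoding might look like to software that consumes it. It is only an illustration: the `<lemma>` child element is the hypothetical element proposed in (b), the sample fragments are invented, and `get_lemma` is a made-up helper name, not part of any TEI tool.

```python
import xml.etree.ElementTree as ET

# Option (c): the lemma stays an attribute, with spaces now permitted.
ATTR_STYLE = '<w lemma="of course">of course</w>'

# Option (b): the lemma moves into a child element (hypothetical <lemma>).
CHILD_STYLE = '<w><lemma>of course</lemma>of course</w>'


def get_lemma(xml_fragment: str):
    """Return the lemma from either encoding, or None if none is given."""
    elem = ET.fromstring(xml_fragment)
    child = elem.find("lemma")
    if child is not None:
        return child.text
    # Fall back to the attribute form; .get() returns None if it is absent.
    return elem.get("lemma")


if __name__ == "__main__":
    print(get_lemma(ATTR_STYLE))
    print(get_lemma(CHILD_STYLE))
```

Either form carries the same information. The difference lies in what the schema has to allow and in how much existing markup would need converting.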
> P.S. some examples of "words" that contain spaces: "of course"
> "n'est-ce pas" "che bella" "et caetera" etc.
> And what is the "lemma" for "words" like "ain't" or "dinna" or
> "dontcha"? Quite likely to include spaces in some kinds of
> lemma-analysis, methinks (oops, there's another one).