On Thu, 25 Apr 2002 at 14:19:56 Charles Muller wrote:
> but I would prefer to use a different tag, if an appropriate one is
> available. Also relevant here is the way I have been using the "lang"
> attribute in my dictionaries, where the attribute values of "lang" are
> always an ISO 639 value, such as lang="ko", lang="ja", lang="en", etc. I
> guess I could make an exception and add CJK as a possible attribute
> value, but it would then be getting a bit unsystematic.
You have touched a sensitive issue. The problem is that as defined
in the TEI, "language" does not mean "language", it means
"language written in a particular writing system". Consequently one
cannot use a particular value of the "lang" attribute either to tag all
instances of text written in a particular language (irrespective of
writing system), or to tag all instances of characters written in a
particular writing system (irrespective of language). Since these
are both things that one needs to do, this is a highly unsatisfactory
situation, and it is to be hoped that it will be remedied in P5.
The solution is to modify the TEI DTD to introduce a further global
attribute indicating the writing system. Then you could mark up text,
for example, like this: "The word <w lang="ko" ws="cjk">[a
word]</w> means [whatever it means]." Moreover, if you wanted to
write the Korean word in transcription, you could write: "The word
<w lang="ko" ws="lat">[a word]</w> means [whatever it means]."
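
To illustrate why keeping the two attributes separate is useful, here is
a minimal sketch in Python (standard xml.etree.ElementTree module) of how
text could then be selected by language alone, by writing system alone,
or by both. The sample markup, the bracketed placeholder contents and the
select() helper are assumptions made for illustration only; "ws" is the
proposed attribute, not part of the standard TEI DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "ws" is the proposed writing-system attribute,
# and the bracketed word contents are placeholders, not real data.
SAMPLE = """<p>
  <w lang="ko" ws="cjk">[Korean word in CJK characters]</w>
  <w lang="ko" ws="lat">[Korean word in transcription]</w>
  <w lang="ja" ws="cjk">[Japanese word in CJK characters]</w>
</p>"""


def select(root, **attrs):
    """Return the <w> elements whose attributes equal all the given values."""
    return [
        w for w in root.iter("w")
        if all(w.get(name) == value for name, value in attrs.items())
    ]


root = ET.fromstring(SAMPLE)

# All Korean text, irrespective of writing system.
korean = select(root, lang="ko")

# All text in CJK characters, irrespective of language.
cjk = select(root, ws="cjk")

# Korean text written in CJK characters only.
korean_cjk = select(root, lang="ko", ws="cjk")
```

With a single "lang" value standing for language and writing system
together, the first two selections could not be expressed by one
attribute value each.
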
If you want to see an example of how this solution to the problem
has been applied, you can have a look at my edition of the
Budapest Glagolitic Fragments, and in particular at the way the
commentary is marked up. You will find it at