Hi Syd,

2011/1/16 Syd Bauman <[log in to unmask]>
[Originally sent 2011-01-04 20:52:41-05:00, but I had some trouble with a new mail client. Sorry.]


Hi Felix!
You're absolutely correct: the ident= attribute of <language> takes a
BCP 47 tag. You can find the documentation for this attribute, and
thus the BCP 47 tags, at
And I agree that the value of "und-002" is a good choice.
And while I don't think it is needed for Lisa's use case, it is worth
mentioning that when you wish to differentiate multiple undetermined
languages you can use private use subtags.  For example, if you have
two distinctly different languages both of which are known to be
African but no further details are available, you could use something
like "und-002-x-a" and "und-002-x-b". (There may be better ways, I
suppose, but …)

Here, I guess, if you have two different languages, you could try to bind them to UN region codes instead, e.g. "und-011" (Western Africa) vs. "und-014" (Eastern Africa). That would avoid the need for private use subtags.
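
To make that concrete, here is a rough sketch of how the two declarations might look in a header. I am assuming the usual <langUsage> inside <profileDesc>, and the descriptive text inside each <language> is only placeholder wording, not anything prescribed:

```xml
<profileDesc>
  <langUsage>
    <!-- "und" = undetermined language;
         011 = Western Africa, 014 = Eastern Africa (UN M.49 area codes) -->
    <language ident="und-011">undetermined language of Western Africa</language>
    <language ident="und-014">undetermined language of Eastern Africa</language>
  </langUsage>
</profileDesc>
```

This only works if you actually know which part of the continent each language belongs to. For a single unknown African language, as in Lisa's case, ident="und-002" alone should be enough.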



On 11-01-03 18:45, Felix Sasaki wrote:
Hello all,

if BCP 47 language tags can be used here (as, e.g., in xml:lang), you
might want to choose the sequence of
"und" (undetermined, as Paul and Sebastian stated)
"002" , a UN Standard Area Code
See section 2.2.4 of http://www.rfc-editor.org/bcp/bcp47.txt for details
about these codes. The BCP 47 language tag would then be und-002; see an
analysis at


2011/1/4 Paul F. Schaffner <[log in to unmask]>


   I think we'd probably just use the three-letter ISO-639-2 code "und"
   ('undetermined'), but that of course fails to capture the "African"
   part of the description. The ISO-3166 region designations can
   be used to further specify, but I do not think that they include
   continents (or anything bigger than countries), so you might
   have to make up a private-use qualification to "und-". I've not
   done that, so I am speaking in ignorance, as always.


   On Mon, 3 Jan 2011, McAulay, Elizabeth wrote:

       Happy New Year TEI folks!

       I am reviewing a TEI Header for a colleague and I'm looking for
       the best way to note in a TEI Header that an "unknown African
       language" is used in part in the document. I wanted to declare
       that information in
       <language ident="">unknown African language</language>

       I've reviewed the P5 guidelines, but the emphasis seems to be on
       declaring known languages.


       Elizabeth "Lisa" McAulay
       Librarian for Digital Collection Development
       UCLA Digital Library Program
       email: [log in to unmask]

   Paul Schaffner | [log in to unmask] | http://www.umich.edu/~pfs/

   316-C Hatcher Library N, Univ. of Michigan, Ann Arbor MI 48109-1190