Hello Stuart,

2011/1/19 stuart yeates <[log in to unmask]>
This is just a quick note that some versions of xsltproc and other libxml-derived tools won't validate language tags such as these.

If you are looking for implementations which do validate these, see http://www.langtag.net/ .

Best,

Felix

The version in version control handles a wider range of language codes (including 'rap', which we have), but I have not tested it on language codes such as "und-002".

cheers
stuart


On 16/01/11 13:09, Syd Bauman wrote:
[Originally sent 2011-01-04 20:52:41-05:00, but I had some trouble with
a new mail client. Sorry.]

Hi Felix!
You're absolutely correct, the ident= attribute of <language> takes a
BCP 47 tag. You can find the documentation for this attribute, and
thus the BCP 47 tags, at
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-data.language.html.
And I agree that the value of "und-002" is a good choice.
And while I don't think it is needed for Lisa's use case, it is worth
mentioning that when you wish to differentiate multiple undetermined
languages you can use private use subtags.  For example, if you have
two distinctly different languages both of which are known to be
African but no further details are available, you could use something
like "und-002-x-a" and "und-002-x-b". (There may be better ways, I
suppose, but …)
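
To make the private-use suggestion above concrete, here is a minimal sketch in Python that builds a <langUsage> element with two such tags. The helper name lang_usage and the two descriptions are hypothetical, and the sketch leaves out the TEI namespace; it only illustrates where the tags would go, not a recommended encoding practice.

```python
import xml.etree.ElementTree as ET


def lang_usage(languages):
    """Build a <langUsage> element from (ident, description) pairs.

    Hypothetical helper for illustration; no TEI namespace is applied.
    """
    root = ET.Element("langUsage")
    for ident, description in languages:
        # Each language gets its BCP 47 tag in the ident attribute.
        lang = ET.SubElement(root, "language", ident=ident)
        lang.text = description
    return root


# Two distinct undetermined African languages, told apart by
# private-use subtags (the part after "-x-").
usage = lang_usage([
    ("und-002-x-a", "unknown African language (first)"),
    ("und-002-x-b", "unknown African language (second)"),
])
print(ET.tostring(usage, encoding="unicode"))
```

The private-use parts ("a" and "b") carry no meaning outside the document, so the prose inside each <language> element is what tells a reader which language is which.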


On 11-01-03 18:45, Felix Sasaki wrote:
Hello all,

if a BCP 47 language tag can be used here (as it can in, e.g., xml:lang), you
might want to choose the sequence of
"und" (undetermined, as Paul and Sebastian stated)
"002", a UN Standard Area Code (M.49) denoting Africa
See section 2.2.4 of http://www.rfc-editor.org/bcp/bcp47.txt for details
about these codes. The BCP 47 language tag would then be und-002; see an
analysis at
http://fabday.fh-potsdam.de/~sasaki/lta/language-tags/q?input=und-002
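
As a rough illustration of how a tag like und-002 breaks down, the Python sketch below splits a tag into a language subtag, an optional region subtag, and an optional private-use part. The pattern is a deliberately simplified assumption: it ignores the script, variant, extension, and grandfathered forms that BCP 47 also allows, and it checks only the shape of the subtags, not whether they are registered. The name split_tag is hypothetical.

```python
import re

# Simplified shape check, NOT a full BCP 47 validator:
#   language: 2-3 letters
#   region:   2 letters or 3 digits (optional)
#   private:  "x" followed by one or more 1-8 character subtags (optional)
SIMPLE_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?:-(?P<private>x(?:-[A-Za-z0-9]{1,8})+))?$"
)


def split_tag(tag):
    """Return the subtags present in `tag`, or None if the shape is not recognised."""
    match = SIMPLE_TAG.match(tag)
    if match is None:
        return None
    return {name: value for name, value in match.groupdict().items() if value is not None}


print(split_tag("und-002"))      # {'language': 'und', 'region': '002'}
print(split_tag("und-002-x-a"))  # {'language': 'und', 'region': '002', 'private': 'x-a'}
```

A tag that passes this check can still be invalid, since the sketch never consults the subtag registry.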

Felix

2011/1/4 Paul F. Schaffner <[log in to unmask]>

    Lisa,

    I think we'd probably just use the three-letter ISO-639-2 code "und"
    ('undetermined'), but that of course fails to capture the "African"
    part of the description. The ISO-3166 region designations can
    be used to further specify, but I do not think that they include
    continents (or anything bigger than countries), so you might
    have to make up a private-use qualification to "und-". I've not
    done that, so I am speaking in ignorance, as always.

    pfs



    On Mon, 3 Jan 2011, McAulay, Elizabeth wrote:

        Happy New Year TEI folks!

        I am reviewing a TEI Header for a colleague and I'm looking for
        the best way to note in a TEI Header that an "unknown African
        language" is used in part in the document. I wanted to declare
        that information in
        <langUsage>
        <language ident="">unknown African language</language>
        </langUsage>

        I've reviewed the P5 guidelines, but the emphasis seems to be on
        declaring known languages.

        Thanks,
        Lisa

        --------------------------------------------
        Elizabeth "Lisa" McAulay
        Librarian for Digital Collection Development
        UCLA Digital Library Program
        http://digital.library.ucla.edu/
        email: [log in to unmask]



    --------------------------------------------------------------------
    Paul Schaffner | [log in to unmask] | http://www.umich.edu/~pfs/
    316-C Hatcher Library N, Univ. of Michigan, Ann Arbor MI 48109-1190
    --------------------------------------------------------------------




--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/