Thanks Syd. Your response gives me a little insight into the way the RFC is
being implemented by TEI members.
In answer to your question about the Registry format: the next version of the
RFC, RFC4646bis, is currently being designed to incorporate an additional
6,800 languages (as soon as ISO 639-3 is published). At the moment the
Registry (hosted by IANA) has approximately 800 records; this will increase
to roughly 7,600 once RFC4646bis is published. The registry format is
currently Record Jar, which means that, more often than not, it needs
conversion into a format that the various applications can import;
certainly the Registry format will not be easy for humans to
read. There is some discussion occurring on the IETF-LTRU forum in
this regard and proposals currently seem to favour retaining the Record Jar
format. However, for non-technical people (like myself and maybe
archivists) this poses a problem. The current discussion (literally in the
past half hour) involves an idea to create conversion tools that can be used
to translate the data into formats that can be used with (say) proprietary
As far as I am aware the XML and W3C communities have adopted RFC4646 and
will certainly adopt RFC4646bis, so yes it will be relevant to the document
that you are currently reviewing.
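For what it is worth, a conversion tool of the kind being discussed need not be large. Below is a minimal sketch in Python, assuming the usual Record Jar conventions: records separated by a line containing only "%%", one "Field: value" pair per line, and long values folded onto following lines that begin with whitespace. The names parse_record_jar and SAMPLE are my own, and the sample records are illustrative ones written in the Registry's style, not an extract from the IANA file.

```python
import json


def parse_record_jar(text):
    """Parse Record Jar text into a list of records.

    Each record is a dict mapping a field name to a list of values,
    because a field may appear more than once within one record.
    """
    records = []
    current = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.strip() == "%%":
            # Record separator: store the finished record, start a new one.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line[0] in (" ", "\t") and last_key is not None:
            # Folded line: append it to the most recent value.
            current[last_key][-1] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current.setdefault(last_key, []).append(value.strip())
    if current:
        records.append(current)
    return records


# Illustrative sample only (hypothetical date and entries).
SAMPLE = """\
File-Date: 2006-09-14
%%
Type: language
Subtag: aa
Description: Afar
%%
Type: language
Subtag: ab
Description: Abkhazian
"""

if __name__ == "__main__":
    # Emit JSON, which most applications can import directly.
    print(json.dumps(parse_record_jar(SAMPLE), indent=2))
```

Once the records are in this shape, writing them out as XML or CSV instead of JSON is a small change to the last step, which is why retaining Record Jar as the master format and supplying converters seems workable.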
> -----Original Message-----
> From: TEI (Text Encoding Initiative) public discussion list
> [mailto:[log in to unmask]] On Behalf Of Syd Bauman
> Sent: 14 September 2006 15:48
> To: [log in to unmask]
> Subject: Re: RFC4646 (was RFC3066bis)
> > I understand the TEI has adopted RFC3066 for language tagging within
> > archived documents. Can you tell me whether the TEI use the
> > tagging standard (RFC) just within software (thus automated
> > or do TEI members tag documents "by hand".
> I can't say with any confidence what the majority of TEI
> projects might do. However, I can say with certainty that the
> vast majority, if not all, of the projects with which I have
> been directly involved either do no language tagging at all,
> or apply language identification to passages of text by hand.
> In P5 encoding this is accomplished using the RFC 3066 (soon
> to be 4646 or BCP 0047 or whatever) tag on an xml:lang=
> attribute. P4 encoding does not explicitly make use of
> standard language tags.
> > We are currently discussing the Registry format (currently Record
> > Jar) on the IETF-LTRU and I was wondering whether anyone here has any
> > opinions on the format.
> I'm sorry, I'm afraid I don't understand what the Registry
> format is or is for. Does this have to do with the "Language
> tags in HTML and XML (Working Draft in review)" W3C document
> that I've only just begun reviewing?
> I hope this helps (even though I doubt it)-: