The discussion in BCP47, unless I completely misunderstand it, refers
(as it should) to *content* of the element bearing the xml:lang
attribute. It cites some typical cases where that content might be
considered as needing multiple language tags.
I don't think it has any bearing on the proposal which I think Stuart is
making... that one might use the tag (i.e. the value of xml:lang) also
to indicate something about the language of some other object than the
content of the element, namely some other thing in the real world which
the content of the element is describing.
Personally, I think that would be a serious (and very confusing) misuse
of this perfectly well-defined attribute.
If you say
<bibl xml:lang="ru"><title xml:lang="en">War and
Peace</title><author xml:lang="en">Leo Tolstoy</author></bibl>
you are saying
a) the content, i.e. all the children of the <bibl>, including attribute
values, CDATA fragments, and recursively the content of its child
elements, are to be treated as being in Russian
b) the content of the <title> and <author> over-ride this by specifying
that their contents are actually in English
In this case, the same effect would be obtained by putting xml:lang="EN"
on the <bibl> obviously. But if there were some other content, such as
some CDATA, or some other non xml:lang-bearing element, then that would
be in Russian.
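
To make the inheritance rule concrete, here is a minimal sketch in Python (standard library only) that works out which language applies to each element of the fragment above. The <note/> child is a hypothetical addition of mine, standing in for "some other non xml:lang-bearing element"; the helper name effective_langs is likewise just an illustration, not anything defined by XML or TEI.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_langs(elem, inherited=None):
    """Yield (tag, language) for elem and, recursively, its descendants.

    An element's own xml:lang wins; otherwise the value in force on the
    nearest ancestor carrying xml:lang is inherited.
    """
    lang = elem.get(XML_LANG, inherited)
    yield elem.tag, lang
    for child in elem:
        yield from effective_langs(child, lang)


# The <note/> element is hypothetical: it carries no xml:lang of its own.
doc = ET.fromstring(
    '<bibl xml:lang="ru"><title xml:lang="en">War and Peace</title>'
    '<author xml:lang="en">Leo Tolstoy</author><note/></bibl>'
)

langs = dict(effective_langs(doc))
for tag, lang in langs.items():
    print(tag, lang)
```

Under these assumptions the <title> and <author> come out as "en", while the <bibl> itself and the untagged <note/> come out as "ru".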
I don't think there are any, but if there were some properties of being
in Russian which affected the layout of the child elements then that
would apply. For example, if the <bibl> cited some language written
right to left, I would be unsurprised to find the order of the child
elements reversed by a canny formatting engine.
Crucially, you are NOT saying anything about the original language of
the abstract entity which this <bibl> refers to. If I say <ref
target="#foo" xml:lang="FR">voir aussi</ref>, I am telling you that the
phrase "voir aussi" is in French, not that the thing pointed to by #foo
is in French.
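
The same point can be sketched in Python (standard library only). The wrapper <div>, the target <p xml:id="foo">, and its German tagging are all hypothetical, invented purely for illustration; the helper lang_of is mine too. The sketch assumes the pointer is a bare "#id" reference within the same document.

```python
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Hypothetical document: a French-tagged pointer whose target is tagged German.
doc = ET.fromstring(
    '<div xml:lang="en">'
    '<ref target="#foo" xml:lang="fr">voir aussi</ref>'
    '<p xml:id="foo" xml:lang="de"/>'
    '</div>'
)

# ElementTree has no parent pointers, so build a child -> parent map.
parent = {child: p for p in doc.iter() for child in p}


def lang_of(elem):
    """Return the xml:lang in force on elem, walking up through its ancestors."""
    while elem is not None:
        if XML_NS + "lang" in elem.attrib:
            return elem.attrib[XML_NS + "lang"]
        elem = parent.get(elem)
    return None


ref = doc.find("ref")
target_id = ref.get("target").lstrip("#")
target = next(e for e in doc.iter() if e.get(XML_NS + "id") == target_id)

print(lang_of(ref))     # language of the phrase "voir aussi"
print(lang_of(target))  # language of the thing pointed to, from its own markup
```

The language of the target is found only by going to the target and looking at its own xml:lang (or that of its ancestors); nothing on the <ref> contributes to it.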
On 06/04/11 07:08, Felix Sasaki wrote:
> 2011/4/5 stuart yeates
> says of xml:lang: "(language) indicates the language of the element
> content using a ‘tag’ generated according to BCP 47"
> Section 4.2 and 4.3 of BCP 47
> http://www.rfc-editor.org/rfc/bcp/bcp47.txt might help to decide about
> your issue. See esp. in section 4.3:
> "In some applications, a single content item might best be associated
> with more than one language tag. Examples of such a usage include:"
> http://www.w3.org/XML/1998/namespace says: "Designed for identifying
> the human language used in the scope of the element to which it's
> attached." http://www.w3.org/TR/xml11/ makes it clear that the scope
> in question is lexical scope (in the computer science sense).
> My question is whether xml:lang can be assumed to make implications
> about the semantic content of the tags as well as the character
> content. Or, in other words, are these two fragments semantically
> the same:
> <bibl xml:lang="ru"><title xml:lang="en">War and
> Peace</title><author xml:lang="en">Leo Tolstoy</author></bibl>
> <bibl xml:lang="en"><title>War and Peace</title><author>Leo
> Stuart Yeates
> Library Technology Services http://www.victoria.ac.nz/library/