Yes, the value of xml:lang by definition specifies the natural language
of the element that carries it, including its attribute values, and of
everything that element contains, unless overridden lower down.
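To make the scoping concrete, here is a minimal sketch in Python using the standard library's ElementTree. It assumes a made-up fragment (the target values and the xml:lang="en" on <linkGrp> are invented for illustration) and a hypothetical helper, effective_lang, which records the xml:lang value in scope for each element.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root):
    """Map each element to the xml:lang value in scope for it.

    Hypothetical helper: a value set on an element overrides the
    inherited one; otherwise the parent's value carries down.
    """
    scope = {}

    def walk(elem, inherited):
        lang = elem.get(XML_LANG, inherited)
        scope[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return scope


# Invented sample: targets and the outer xml:lang are assumptions.
doc = ET.fromstring(
    '<linkGrp xml:lang="en">'
    '<ptr target="#p1" xml:lang="pl" n="a"/>'
    '<ptr target="#p2" n="b"/>'
    '</linkGrp>'
)

for elem, lang in effective_lang(doc).items():
    print(elem.tag, dict(elem.attrib), "->", lang)
```

Whatever language the helper reports for an element is, per the XML Spec, also the language of that element's attribute values, so the @n on the first <ptr> counts as Polish whether or not that was intended.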
Yes, this was an issue which caused some concern in some quarters
(Espen, are you still there?) when the issue of adopting xml:lang was
first discussed, during the move to P4.
In P3 the scope of the @lang attribute is rather ill defined. It
probably was intended to relate only to the element content, but I am
not sure that anyone ever thought through the full implications of that.
Certainly it's unclear how exactly you would specify the language for
one attribute but not another without doubling the number of attributes.
Anyway, one of the consequences of that decision was that the War on
Attributes promptly broke out, and we moved to the present simpler world
in which attribute values rarely if ever use natural language, so we just
don't have to worry about hyphenation rules, script rules, etc. They are
(mostly) sequences of specific Unicode characters to be interpreted as
symbols only, despite their occasional resemblance to real language
words (the same might, in passing, be said for the element or attribute
names).
As Laurent has already pointed out, this really doesn't seem to be a
major problem. There is full scope for defining and controlling the
meaning of the symbols used as attribute values in your ODD (using a
<valList>) and indeed for documenting the language from which you drew
them.
It's interesting to note that one of the very first major controversies
in the TEI concerned whether or not to permit attributes at all. The
chair of the nascent metalanguage committee in fact resigned over this
issue in 1989 or thereabouts. I sometimes wonder whether she'd have a
wry chuckle at the way history has (partially) vindicated her.
On 08/04/11 07:47, Piotr Bański wrote:
> Hi Stuart,
> Half alive after a 15-hour transfer across the Puddle I can't resist
> mentioning that you've apparently just demonstrated some horrible
> short-sightedness on the part of the inventor(s) of xml:lang -- how can
> one force us to at the same time declare the language *unconditionally*
> for *both* element and attribute content?? Think of dictionaries.
> Some part of my brain has a memory of something like xml:lang pertaining
> to element content alone, and of attributes not being addressed by it.
> This memory is clearly wrong in the light of the recent quote from the
> XML Spec. But is another memory, of the controversy over switching
> from @lang to @xml:lang, not related to that? Was @lang (of P3?)
> meant for element content alone perhaps? I do hope I am missing
> something here.
> Because if what you say is as true as it apparently is, it's not really
> a matter of Lou being right or wrong, it's a matter of what attribute
> values you are theoretically allowed to use on any element that contains
> a string in a language that you want to identify. Your example concerned
> @n, but isn't the same logic applicable to e.g. @type then? (etc. --
> even if one tries to wiggle out of my question by saying that @type is
> symbolic, it doesn't matter because xml:lang may also be about the
> script, not just the language).
>> [Sorry if you have already received an email similar to this, I'm having
>> email issues at my end.]
>> I have come to realise that Lou is right about this.
>> Even in Piotr's minimal case, xml:lang already has a meaning and a
>> meaning that matters in the real world:
>> <linkGrp xml:id="...">
>> <ptr xml:id="..." target="..." type="..." xml:lang="pl" n="a"/>
>> <ptr xml:id="..." target="..." type="..." xml:lang="sw" n="b"/>
>> </linkGrp>
>> The language of each of the @n attribute values 'a' and 'b' is
>> determined by its respective @xml:lang attribute. If systems
>> potentially use @n attributes for collation or display (as we do at the
>> NZETC), then the language of the @n attributes matters.
>> Thus, this is not a case where unspecified meaning in the standard can
>> be exploited to stash the language of the referent.