I wonder if TEI does not paint itself into a corner by relying so much on attributes ….
The design principle of TEI, as an XML language, appears to be that text nodes represent the text to be encoded and structural markup and attributes represent interpretation of the text. This would be fine if all interpretation could be fitted neatly into this mould, but there are exceptions, even common ones like figDesc: here the text node manifestly does not represent "what is there in the text" (in this case an image), but the encoder's interpretation of it. Evidently, figDesc is used for something more expressive than is possible in an attribute ….
(This leads to a loss in decodability: when we index our texts, we have to exclude elements like figDesc - otherwise our users will think that the fancy terms we employ there occur in the sources.)
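The indexing workaround can be sketched roughly as follows. This is only an illustration: the function name `source_text`, the `EDITORIAL` set and the sample paragraph are assumptions, and the TEI namespace is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Elements whose text is the encoder's commentary rather than source text.
# The contents of this set are an assumption; a real project would extend it.
EDITORIAL = {"figDesc"}

def source_text(elem):
    """Collect the text to index, skipping editorial elements entirely."""
    if elem.tag in EDITORIAL:
        return ""
    parts = []
    if elem.text:
        parts.append(elem.text)
    for child in elem:
        parts.append(source_text(child))
        # Tail text follows the child's end tag and belongs to the parent,
        # so it is kept even when the child itself is skipped.
        if child.tail:
            parts.append(child.tail)
    return "".join(parts)

# Made-up sample; no namespace, for brevity.
sample = (
    "<p>See the map <figure><figDesc>Mercator projection</figDesc></figure>"
    " of Paris.</p>"
)
indexed = " ".join(source_text(ET.fromstring(sample)).split())
print(indexed)
```

The point of the sketch is that the indexer needs an external list of "non-source" elements; nothing in the markup itself says which text nodes are which.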
Attributes have poor expressive power compared to the elements they sit on. An attribute can hold several values, but only under syntactic restrictions, usually concerning white space. More importantly, attributes cannot contain elaborations in the form of nested elements - or in the form of further attributes.
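A small sketch of the white-space restriction, assuming a consumer that reads a multi-valued attribute as a whitespace-separated list. The value "farm labourer" is a made-up example, not something from an actual encoding.

```python
# A whitespace-separated attribute value, as in <name type="person labourer">.
chained = "person labourer"
values = chained.split()
print(values)

# Hypothetical value that itself contains a space: once it sits in the list,
# it cannot be told apart from two independent values.
ambiguous = "person farm labourer"
ambiguous_values = ambiguous.split()
print(ambiguous_values)
```

With child elements, each value would have its own delimiters and could contain white space, or further markup, without ambiguity.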
There is nothing that an attribute can do that an element can't do - except irritate.
Let us say we have the name "Paris" in a text. Jack thinks this is a place name, Jill thinks it is a personal name. I would claim that this is a common situation in a collaborative environment. Certainty might also be involved. Jill thinks it is likely that it is a place name, but wants to record the possibility that it is a personal name. Surely, TEI should allow this, but packing this kind of interpretation into attributes makes it impossible.
To me, the only way forward is to give up the reliance on attributes for interpretation and to mark up explicitly which text nodes represent the textual contents one deems to be "there."
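To make the "Paris" case concrete, here is one possible element-only encoding, wrapped in a few lines of Python that read it back. None of the element names (`reading`, `txt`, `interp`, `by`, `category`, `certainty`) are TEI; they are invented for this sketch, and other designs are certainly possible.

```python
import xml.etree.ElementTree as ET

# Hypothetical element-based markup; these element names are NOT TEI.
# <txt> marks the characters deemed to be present in the source.
PARIS = """
<reading>
  <txt>Paris</txt>
  <interp>
    <by>Jack</by>
    <category>place name</category>
  </interp>
  <interp>
    <by>Jill</by>
    <category>place name</category>
    <certainty>likely</certainty>
  </interp>
  <interp>
    <by>Jill</by>
    <category>personal name</category>
    <certainty>possible</certainty>
  </interp>
</reading>
"""

def interpretations(xml_text):
    """Return (who, category, certainty) for every recorded interpretation."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("by"),
            item.findtext("category"),
            # An interpretation without a <certainty> child gets a marker value.
            item.findtext("certainty", default="unstated"),
        )
        for item in root.findall("interp")
    ]

readings = interpretations(PARIS)
for who, category, certainty in readings:
    print(who, category, certainty)
```

Because each interpretation is an element, it can repeat freely and carry its own attribution and certainty, which a single attribute on one element cannot do.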
There are surely other ways of doing this ….
If all attributes were to be converted into elements and all representational text nodes were to be explicitly marked, every nuance of TEI markup as we have it now could be maintained. Markup would be more verbose, to be sure, but easier to decode automatically - and above all easier to expand and be creative with.
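A rough sketch of such a conversion, under stated assumptions: the `attr-` prefix and the `txt` wrapper element are invented names, namespaced attributes such as xml:id are not handled, and whitespace-only text nodes are dropped, so this is an illustration rather than a lossless converter.

```python
import xml.etree.ElementTree as ET

def attributes_to_elements(elem):
    """Copy elem, turning attributes into child elements and wrapping
    every non-blank text node in an explicit <txt> element."""
    new = ET.Element(elem.tag)
    for name, value in elem.attrib.items():
        # Hypothetical naming scheme: @type becomes <attr-type>.
        ET.SubElement(new, "attr-" + name).text = value
    if elem.text and elem.text.strip():
        ET.SubElement(new, "txt").text = elem.text
    for child in elem:
        new.append(attributes_to_elements(child))
        # Text after the child's end tag is source text of the parent.
        if child.tail and child.tail.strip():
            ET.SubElement(new, "txt").text = child.tail
    return new

original = ET.fromstring('<name type="place">Paris</name>')
converted = ET.tostring(attributes_to_elements(original), encoding="unicode")
print(converted)
# prints: <name><attr-type>place</attr-type><txt>Paris</txt></name>
```

Once the former attribute is an element, it can be repeated, qualified or annotated with further child elements, which is the extensibility argued for above.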
On Aug 25, 2011, at 6:40 PM, Sebastian Rahtz wrote:
> On 25 Aug 2011, at 17:34, Martin Holmes wrote:
>> I think we could just chain together @type values (for instance), like this:
>> <persName type="labourer"> ->
>> <name type="person labourer">
>> Any problem with that? It's not lossy, really.
To me, this misrepresents the semantics of @type. @type on <name> should give a type of name, but here "labourer" is a categorisation of the person named. There are many types of name (ethnic groups, religions, brands, planets, periods of time, etc.), but I don't think there is a specific type of name for labourers ….
> we'd have to change @type to allow multiple values. which may be
> no bad thing anyway.
> Sebastian Rahtz