Dear Christian,

Thank you for referring me to the historical precedents - it is comforting to know that these issues have been raised before and that a conclusion has been reached. I wonder if the discussion you refer to took place in May 2003 - the thread "Using attributes to record data" at <http://listserv.brown.edu/archives/cgi-bin/wa?A1=ind0305&L=tei-l>? However, the handful of postings that touch on the topic of attributes versus "prose" hardly merit the description "war," so perhaps the discussion did not take place on the mailing list?

I would still claim that storing interpretation in attributes is a guiding principle of TEI, in the sense that this is the first place one looks and that text nodes are used only if this proves patently impossible, as in <*Desc> elements. The existence of these exceptions was part of the problem I raised (touched upon in 2003 as well, I see). My claim is that expressibility is needlessly hampered by this.

Stand-off markup is usually considered in cases of overlapping element hierarchies: this is the first time I have seen it advocated because of "overlapping attributes." But whereas overlapping hierarchies are a problem of XML as such, overlapping attributes are only a problem of the TEI application of XML (the choice between text and attribute nodes). Anyway, stand-off markup is hardly known for ease of processing: there is nothing I would like to do more than employ this approach, but I feel that, in the absence of standard procedures and tools, it is more a problem than a solution. Note also that stand-off markup can be used to hide the fact that a schema does not represent what it claims to represent, i.e. it can stand in the way of getting at the root of a problem.

Best,

Jens

On Aug 27, 2011, at 3:56 AM, Christian Wittern wrote:

> On 2011-08-26 23:15, Jens Østergaard Petersen wrote:
>> On Aug 26, 2011, at 2:54 PM, Martin Holmes wrote:
>> 
>>> I thought what he was basically saying was that _if_ this separation were rigid:
>>> 
>>> - elements (in <text> at least) contain transcribed content
>>> 
>>> - attributes contain editorial/interpretive/metadata content
>>> 
>>> then indexing and searching the original text would be much simpler.
>> Hi Martin,
>> 
>> I try to guess at the design principles behind TEI, seen as an XML application. To me it looks as if this separation is one of the guiding principles, though I don't believe I have ever heard anyone say so. Following discussions on the list, it appears to be second nature to the TEI community.
> 
> The above was more or less the case up to P4, after which we had the "war on attributes" (ca. 2002; the attributes lost), nowadays in P5 there are many cases, where you can't just take the stuff that is in black in Oxygen and run with it as your text.
> 
> As Sebastian mentioned, it was at some time considered to go further down this road and allow markup to be expressed either as attribute or as element.  At that time, rather early in the development of the ODD language, the idea was abandoned (to my relief).   I think the type of things you want to express do belong in a layer different from the "text as such" and are better handled using stand-off constructs.
> 
> Christian
> 
> 
> -- 
> Christian Wittern, Kyoto
>