I think in most cases non-transcript text is most like a footnote or
endnote, which humans have been able to parse as standoff markup for
centuries now.  It's much more readable if we use anchor tags or give
every element an xml:id for reference, but it could be accomplished
without either of these intrusions simply by XPath-ing from the root:

e.g.:

//TEI/TEXT/DIV[3]
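
To make that concrete, here is a minimal sketch in Python of a stand-off note that points at its target by position alone, with no anchor and no xml:id. The sample document, the NOTES list and the resolve_notes helper are all made up for illustration. It assumes a TEI P5 document, where element names are in the http://www.tei-c.org/ns/1.0 namespace, are lowercase apart from TEI itself (XPath is case-sensitive), and div sits inside body. The standard-library ElementTree only supports a subset of XPath, so the path is written relative to the root element.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Made-up sample transcription: no anchors, no xml:id attributes.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div><p>First division.</p></div>
      <div><p>Second division.</p></div>
      <div><p>Third division, the target of the note.</p></div>
    </body>
  </text>
</TEI>"""

# Hypothetical stand-off notes, kept apart from the transcription.
# Each note addresses its target purely by position under the root.
NOTES = [
    {
        "target": "./tei:text/tei:body/tei:div[3]",
        "note": "Editorial comment on the third division.",
    },
]


def resolve_notes(root, notes):
    """Pair each note with the text of the element its path points to.

    Returns a list of (note, target_text) tuples; target_text is None
    when the path does not match anything in the document.
    """
    resolved = []
    for entry in notes:
        element = root.find(entry["target"], TEI_NS)
        if element is None:
            resolved.append((entry["note"], None))
        else:
            text = "".join(element.itertext()).strip()
            resolved.append((entry["note"], text))
    return resolved


root = ET.fromstring(SAMPLE)
for note, target_text in resolve_notes(root, NOTES):
    print(f"{note} -> {target_text}")
```

The obvious weakness is that a positional path breaks as soon as someone inserts a division ahead of the target, which is the price of leaving the transcription untouched.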

Doug


2011/8/29 Christian Wittern:
> Good evening Jens,
>
> On 2011-08-29 15:30, Jens Østergaard Petersen wrote:
>>
>> Thank you for referring me to the historical precedents - it is comforting
>> to know that these issues have been raised before and that a conclusion has
>> been reached. I wonder if the discussion you refer to took place in May 2003
>> - the thread "Using attributes to record data"
>> at <http://listserv.brown.edu/archives/cgi-bin/wa?A1=ind0305&L=tei-l>?
>> However, the handful of postings that touch on the topic of attributes
>> versus "prose" hardly merit the description "war," so perhaps the discussion
>> did not take place on the mailing list?
>
> The discussions started on the list of the working group on character
> encoding (that list was hosted in Oxford -- no idea if the archives are
> available and/or searchable) and were later carried further on the Council
> list (archived here:
> http://lists.village.virginia.edu/pipermail/tei-council/).
> If I remember correctly, the introduction of <choice> was also one outcome
> of these discussions and paved the way for eroding the distinction you
> mention.  The expression "war on attributes" was actually used rather early
> in the discussion, but if you search TEI-L for that, you will find only
> later reminiscences and memories, not the discussion itself.
>
>> I would still claim that storing interpretation in attributes is a guiding
>> principle of TEI, in the sense that this is the first place one
>> looks and that text nodes are used only if this proves patently impossible,
>> as in <*Desc> elements. The existence of these exceptions was part of the
>> problem I raised (touched upon in 2003 as well, I see). My claim is that
>> expressibility is needlessly hampered by this.
>
> The problem I have with that approach is that, as long as you envision TEI
> source code as something created and curated by actual humans rather than
> something written and read by programs (like, for example, the HTML code
> produced by M$ Word), this is a slippery slope to take.
>
>
>> Stand-off markup is usually considered in case of overlapping element
>> hierarchies: this is the first time I have seen it advocated because of
>> "overlapping attributes."
>
> Stand-off markup is a general principle that can be used to solve a lot of
> use cases, the general notion of "annotation" being one of them.  This is
> especially useful if you have multiple people doing annotations (at which
> point you almost automatically arrive in the land of overlapping hierarchies
> of whatever nature).
>
>> But whereas overlapping hierarchies are a problem of XML as such,
>> overlapping attributes are only a problem of the TEI application of XML (the
>> choice between text and attribute nodes). Anyway, stand-off markup is hardly
>> known for ease of processability: there is nothing I would like to do more
>> than employ this approach, but I feel that, in the absence of standard
>> procedures and tools, it is more a problem than a solution.
>
> Well said.  But you could also use this as an argument for the need for
> tool development....
>
>> Note also that stand-off markup can be used to hide the fact that a schema
>> does not represent what it claims to represent, i.e. it can stand in the way
>> of getting at the root of a problem.
>
> Or showing you the path to the solution of the problem, in so far as it
> shows you where your schema has failed.
>
> All the best,
>
> Christian
>
> -- Christian Wittern, Kyoto
>