Sylvain Loiseau wrote:

> For annotating this kind of information I use NMTOKENS in the @ana
> attribute. For instance
>
>     <w ana="pos.art g.m n.s">le</w>
>
> This is not pure P5 I suppose since my tokens are not "xsd:anyURI"...

It sure ain't. This is straightforward attribute abuse. Plainly in P4 it
wouldn't even validate, since there are unlikely to be any matching IDs. In
P5 schema-based validation, because "pos.art g.m n.s" just might be a
whitespace-separated list of names of files in the current filesystem, it
can get past validation as a sequence of URIs, but that's no justification
for fooling the validator in that way. Successful validation per se is never
enough. And if it's achieved at the cost of violating the plain intended
semantics of elements or attributes it's worse than useless. The value of an
ana is meant to *point to* an element or elements containing the
interpretive information. In P4 that pointing is done via id-idref(s), in P5
it is done by XPointer(s). Anything else is just plain wrong. If you want to
use attributes whose in situ values have an intrinsic semantics expressing
the grammatical or other categories you want to assign to the <w> then fine:
just extend the scheme to include an attribute to do that job and tell us
about it in your teiHeader. Though it would be preferable to put your
interpretive information into suitable elements and use ana to point to
them, since that works out of the box. But, as with War and Peace as the
"valid" contents of a <c>, the mere fact that something validates in a case
where the validation mechanism isn't capable of enforcing what the
Guidelines clearly say doesn't turn abuse into conformance.
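
As a minimal sketch of the pointing approach under P5, the example quoted
above might be recast along the following lines. The xml:id values, the type
labels and the text inside each <interp> are illustrative assumptions, not
taken from Sylvain's actual data:

    <interpGrp type="pos">
      <interp xml:id="pos.art">article</interp>
    </interpGrp>
    <interpGrp type="gender">
      <interp xml:id="g.m">masculine</interp>
    </interpGrp>
    <interpGrp type="number">
      <interp xml:id="n.s">singular</interp>
    </interpGrp>

    <!-- elsewhere, in the text itself -->
    <w ana="#pos.art #g.m #n.s">le</w>

Each value in ana is then a URI reference that resolves to an element
carrying the interpretation, rather than a bare token that merely happens to
get past the validator.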

Michael Beddow