One more question/nitpick. You say:
> "#p-acp" is no valid pointer (no valid URI)
Well, it is not, but it is a valid fragment identifier (see ), and
somewhere in the maze of W3C specs there is a statement that a bare
fragment identifier is interpreted as if it were appended to the URI of
the current document, yielding a correct (longer) URI. So I think you
are fine syntactically. (Or have you actually got a failed validation
result? I'd be very curious to see a test case then.) Semantically,
though, you obviously are not: we address this "pretend that POS values
are fragIDs, just for the sake of the teidata.pointer datatype" issue in
the text of the GitHub ticket to which I pointed you, alongside other
arguments against using @ana for this purpose.
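
To make the "virtually appended" point concrete, here is a minimal sketch of
standard URI reference resolution using Python's urllib.parse. The base URI
is a made-up example and resolve_pointer is a hypothetical helper, neither
comes from the TEI Guidelines or from MorphAdorner; the two values are the
ones quoted in this thread.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical URI of the document containing the <w> elements (an assumption).
BASE = "http://example.org/texts/A88624.xml"


def resolve_pointer(value: str, base: str = BASE) -> str:
    """Resolve a pointer value against the URI of the current document."""
    return urljoin(base, value)


# A bare fragment identifier is attached to the current document's URI.
with_hash = resolve_pointer("#p-acp")

# Without the "#", the same kind of value is read as a relative path,
# so it resolves to a different (sibling) resource altogether.
without_hash = resolve_pointer("4-S--------")

print(with_hash)                       # http://example.org/texts/A88624.xml#p-acp
print(without_hash)                    # http://example.org/texts/4-S--------
print(urldefrag(with_hash).fragment)   # p-acp
```

Both results are well-formed URIs, which is the syntactic point; whether
anything with the ID "p-acp" actually exists in the document is the semantic
question, and the sketch does not check that.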
On 01/05/18 16:39, Piotr Bański wrote:
> Dear Paolo,
> Please have a look at the proposal addressing this at
> It avoids the "POS-in-@ana" issue, and provides arguments for that.
> You will also see there a list of projects that use the proposed
> format, some of them based on MorphAdorner.
> The practical question for you now, I guess, is whether to keep the
> existing TEI skeleton and disobey the @ana datatype, or to adopt the
> changes we have suggested in the ticket and put the POS information
> where it belongs, hoping that the Council will address the issue
> before the end of the world. It's a gamble... :-)
> Best wishes,
> On 01/02/18 21:11, Paolo Monella wrote:
>> Dear all,
>> I ran a lemmatizer/PoS tagger (TreeTagger) on a TEI P5-encoded file
>> and want to encode the result in attributes of <w>.
>> I searched the TEI-L archives and the Internet. I found that
>> MorphAdorner  uses @lemma for lemmata and @ana for the PoS output
>> (e.g. "adjective, positive genitive plural masculine"):
>> <w lemma="in" ana="#p-acp" reg="in" xml:id="A88624-000740">in</w>
>> I had tried this encoding:
>> <w ana="4-S--------" lemma="in" n="in" xml:id="w315">in</w>
>> The main difference is that MorphAdorner prepends a "#" to the value
>> of @ana because this value should be a teidata.pointer .
>> In any case, also "#p-acp" is no valid pointer (no valid URI), so do
>> you think I should leave my encoding as it is, or prepend "#" as in
>> Thank you,
>>  See paragraph "Simplified TEI P5-like output" in