Apologies for a response that contains only discussion and questions. I'm curious to know how it is possible for one person to mention another person both once and twice within the same textual source?

It seems you're building a dataset, a kind of encoded metadata, to enable so-called distant reading. If so, why use TEI at all? The idea appears similar to work in scientometrics, where bibliometric relations are extracted from standard metadata to express or visualize a kind of citation analysis. For example, I am thinking of the work of Loet Leydesdorff, who offers access to tools for Referenced Publication Years Spectroscopy (RPYS) in a private venue:

But assuming you need to use TEI, I've noted the formation of a TEI Linked Open Data (LOD) community of interest in recent years. I wondered whether the related discussion/references could be applicable?

A more extensive bibliography appears with accompanying theoretical discussion in this JTEI article (Formal ontologies, linked data, and TEI semantics) by Fabio Ciotti and Francesca Tomasi:

I hope you'll post again later with more information about your project/solution. I'm contemplating an archival project to encode a collection of letters using TEI, and am curious about how best to encode actionable person relations as they arise across the collection. One aim would be to contextualize named person entities such that a network graph might be extracted and visualized when exposed to suitable downstream tools.
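
As a rough illustration of the kind of extraction I have in mind, here is a minimal Python sketch that reads the two <relation> elements quoted below and turns them into weighted edges for a graph tool. Several things here are my own assumptions rather than anything in Antonio's data: the <listRelation> wrapper, the absence of the TEI namespace, the function name relation_edges, the reading of @ana as a bare integer count, and the rule that the higher count wins when the same pair is recorded twice.

```python
import xml.etree.ElementTree as ET

# Sample data: the two <relation> elements from the quoted message, wrapped in
# a <listRelation> so the fragment parses as one document. A real TEI file
# would declare the TEI namespace, which this sketch leaves out for brevity.
SAMPLE = """
<listRelation>
  <relation active="#person_8726289" name="mentions"
            passive="#person_000001 #person_000002 #person_000005 #person_66462281 #person_66806872"
            ana="1" source="#carta_es_0001"/>
  <relation active="#person_8726289" name="mentions"
            passive="#person_66806872" ana="2" source="#carta_es_0001"/>
</listRelation>
"""


def relation_edges(xml_text):
    """Return {(source, active, passive): weight} for 'mentions' relations."""
    edges = {}
    root = ET.fromstring(xml_text)
    for rel in root.iter("relation"):
        if rel.get("name") != "mentions":
            continue
        # Assumption: @ana holds a bare integer count (default 1 if absent).
        weight = int(rel.get("ana", "1"))
        source = rel.get("source", "")
        # @active and @passive are whitespace-separated lists of pointers.
        for active in rel.get("active", "").split():
            for passive in rel.get("passive", "").split():
                key = (source, active, passive)
                # Assumption: if a pair is recorded more than once,
                # keep the highest count rather than summing.
                edges[key] = max(weight, edges.get(key, 0))
    return edges


edges = relation_edges(SAMPLE)
for (source, active, passive), weight in sorted(edges.items()):
    print(f"{active} -> {passive} [{weight}] in {source}")
```

The resulting dictionary could be passed on to a graph library or written out as an edge list. Whether duplicate pairs should be merged by maximum, summed, or flagged as an encoding error is exactly the question I raise at the top of this message.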

All best,
Grace Wiersma

Antonio Rojas Castro wrote:
Hello List,

I am encoding information derived from letters - rather than encoding the texts themselves. 

I am interested in representing mentions and how many times the author of the letter mentions each named entity. I am currently using <relation> to encode these pieces of information along with <person>:

<relation active="#person_8726289" name="mentions" passive="#person_000001 #person_000002 #person_000005 #person_66462281 #person_66806872" ana="1" source="#carta_es_0001"/>

<relation active="#person_8726289" name="mentions" passive="#person_66806872" ana="2" source="#carta_es_0001"/>

In this case, both <relation> elements have the same author recorded with @active and several "targets" recorded with @passive. In the first <relation> element those people were mentioned only once, so I am storing that information using @ana. In the second element the author mentions one of these authorities (they are mostly Latin authors) twice, so I used @ana="2". Both elements have the same source (@source="#carta_es_0001") because they are "facts" contained in the same letter. However, I am not very happy with the use of @ana to represent the frequency or the weight of the relation.

Does anyone know an alternative or can share her/his experience/opinion?

(I do not like using @name to store action verbs like "mentions" or "quotes" either, but this is the closest way I found in the TEI to emulate RDF or standoff markup.)

Thank you for your feedback. 


Dr. Antonio Rojas Castro
Post-doctoral Researcher, Cologne Center for eHumanities
Editor, The Programming Historian en español