On 20.02.2015 at 22:23, Laurie Allen wrote:
> Hi All,
> I'm working with a linguist on encoding some 16th through 18th Century
> texts. The first text we're working on is a book published in 1578.
> We're mostly encoding for the structure of the book, and for the
> languages used in it (it's mainly written in Spanish, with Zapotec,
> Latin, and some Nahuatl interspersed).
> In addition to showing the book in various ways, we are pulling in
> additional information when the Zapotec language is used in the book.
> For Zapotec terms, we join the TEI (in XSLT) to the linguistic analysis
> of the same Zapotec term as exported from the linguist's database.
> All good until...
> The author of the book, in describing the Zapotec language, often says
> things that would translate to: (this is a made up example.) 
> "They say over or well cooked when meat is done cooking."
> However, the sentence would include three languages so that 
> --- "They" "say" "when meat is done cooking" would be in Spanish.
> --- "over" and "well" and "cooked" would be in Zapotec.
> --- "or" would be in Latin. And it would actually be abbreviated as l.
> So, we've been encoding it as:
> They say <foreign xml:lang="cvz">over</foreign>
> <foreign xml:lang="lat"><choice><abbr>l</abbr><expan>vel</expan></choice></foreign>
> <foreign xml:lang="cvz">well cooked</foreign> when meat is done cooking.
> Now, when the linguist enters this into her database, she says that this
> section includes 
> "over-cooked" and "well-cooked" as phrases in Zapotec, which it does, in
> its meaning. This comes up a lot in the book, and it's important that
> the TEI include a matching over-cooked somewhere that can be searched
> for, and matched up with the exported linguistic analysis of over-cooked. 
> I suspect I want to be using link and ptr here, but I'm having a
> difficult time making sense of how that would work. 

The basic problem seems to be one of truncated compounds. I could
imagine three approaches to dealing with this:

* Add the “implicit” part of the compound, through <supplied> (as Lou
suggested), or through <choice>/<orig>/<reg> (since this is some kind of
* Add a completed lemma, like <foreign xml:lang="cvz"><w lemma="over
cooked">over</w></foreign>. This might make less sense, since you are
not using <w>/@lemma elsewhere.
* Add this stand-off using an interpretation mechanism, as described in
<>. E.g.

They say <foreign xml:lang="cvz" xml:id="x1">over</foreign>
<foreign xml:lang="cvz">well cooked</foreign> when meat is done cooking.
<span type="truncated-compound" target="#x1">over cooked</span>
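
To make the stand-off idea a bit more concrete, here is a minimal sketch of how such a <span> could be resolved back to the word it annotates, so that "over cooked" becomes searchable and can be matched against the exported analysis. It is written in Python only for brevity; the same lookup should be possible in XSLT. The wrapping <p>, the missing TEI namespace, and the function name index_spans are assumptions made for illustration, and the @type value is hyphenated on the assumption that TEI expects a single token there.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:id attribute as ElementTree reports it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified fragment: wrapped in a hypothetical <p> and left without the
# TEI namespace so that it parses on its own.
SAMPLE = """<p>They say <foreign xml:lang="cvz" xml:id="x1">over</foreign>
<foreign xml:lang="cvz">well cooked</foreign> when meat is done cooking.
<span type="truncated-compound" target="#x1">over cooked</span></p>"""


def index_spans(root):
    """Map the text of each stand-off <span> to the element its @target names."""
    # Collect every element that carries an xml:id.
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    index = {}
    for span in root.iter("span"):
        target = span.get("target", "")
        # Only local pointers of the form "#id" are handled in this sketch.
        if target.startswith("#") and target[1:] in by_id:
            index[(span.text or "").strip()] = by_id[target[1:]]
    return index


root = ET.fromstring(SAMPLE)
index = index_spans(root)
hit = index["over cooked"]
print(hit.text, hit.get(XML_ID))
```

The key of the resulting index is the completed compound, which is what the linguist's database would presumably export, and the value is the <foreign> element in the running text.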

But these are really just my spontaneous ideas.


Dr. Frederik Elwert

Post-doctoral researcher
Project manager SeNeReKo
Center for Religious Studies
Ruhr-University Bochum

Universitätsstr. 150
D-44780 Bochum

Room FNO 01/180
Tel. +49-(0)234 - 32 24794