You are right, of course, and my reply was rather too hasty.
Reading your post more carefully, I wonder if this is yet another case
for @corresp:
<seg corresp="#b.MAT.17.22 #b.MAT.17.23">
Os soe sik in Galiläa uphoelen, sia Jesus: Doe Minskensuone
sall baule den Hännen fan den Minsken iutliewert weren. Soe
weret en dautmaken, owwer am drüdden Dage sall hoe wir upston.
Do woören soe olle bedroöwet.
</seg>
<seg xml:id="b.MAT.17.22" type="verse">
Aus see enn Galilaea eromm jinje, saed Jesus to an: "De
Menschesaen woat boolt enn Mensche aeare Henj jejaeft woare,
</seg>
<seg xml:id="b.MAT.17.23" type="verse">
en dee woare am doot moake, oba aum drede Dach woat hee fomm
Doot oppstone." En siene Jinja weare seeha truarich do aewa.
</seg>
The only drawback with this is that the Guidelines are a bit coy
(understated) on the subject of whether @corresp="#x #y" means
"corresponds with the concatenation of x and y" or "corresponds with x
and also with y": you need richer tagging to resolve that ambiguity.
The drawback with using @corresp for your other case, where you want to
show a passage which is similar but not identical, is that you can't
then be certain of the semantics.
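To make the lookup concrete, here is a minimal sketch in Python (rather than XQuery) of the "try the identifier first, then fall back to @corresp" strategy. It is an illustration under stated assumptions, not a recommended implementation: the names SAMPLE and find_verse are hypothetical, the sample omits the TEI namespace for brevity, the verse text is replaced by placeholders, and each pointer in @corresp is treated as an independent match (the "x and also y" reading).

```python
import xml.etree.ElementTree as ET

# Clark-notation name under which ElementTree exposes xml:id.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: no TEI namespace, placeholder verse text.
SAMPLE = """<body>
  <seg corresp="#b.MAT.17.22 #b.MAT.17.23">(loose rendering of verses 22-23)</seg>
  <seg xml:id="b.MAT.17.24" type="verse">(verse 24)</seg>
</body>"""


def find_verse(root, verse_id):
    """Return (elements, how): exact xml:id matches if there are any,
    otherwise elements whose @corresp points at verse_id."""
    exact = [el for el in root.iter() if el.get(XML_ID) == verse_id]
    if exact:
        return exact, "xml:id"
    # Assumption: @corresp holds whitespace-separated "#id" pointers and
    # each pointer counts as a match on its own.
    pointer = "#" + verse_id
    loose = [el for el in root.iter()
             if pointer in el.get("corresp", "").split()]
    return loose, ("corresp" if loose else None)


root = ET.fromstring(SAMPLE)
matches, how = find_verse(root, "b.MAT.17.23")
print(how, [el.get("corresp") for el in matches])
```

Returning how the match was found lets the caller tell an exact verse from a looser correspondence, which is precisely the distinction that @corresp on its own does not record.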
I wonder also if you've considered simply using <ref> or <ptr> *within*
the <seg>? Then you could use @cRef, which is actually designed for
these kinds of canonical references:
Honour thy father and thy mother: that thy dayes may bee long
vpon the land, which the Lord thy God giueth thee.
Hmm, not sure whether that is altogether a good idea.
On 07/11/13 13:51, Christian Chiarcos wrote:
> Dear Lou,
> thank you very much for this suggestion. @key seems indeed to be close
> to my @altid, and I would like to use it for this purpose, but I
> haven't figured out a schema-conformant way to employ it. According to
> tei_allPlus.rng, key and ref seem to be restricted to elements such as
> <name>, but not be applicable to structure elements within the text.
> So, what kind of structure element would be both compatible with @key
> and roughly correspond to what Resnik rendered as <div> (bible
> chapters, could remain div in TEI) and <seg> (bible verses, most
> likely l or p in TEI, maybe span)?
> Another idea I had was to use <link targets="..."/> and to provide
> resolvable URIs for the verses. Still not an optimal solution, because
> it does not provide sufficiently restrictive semantics, but this would
> create only one new element and one attribute for the entire set of
> cross references per line, verse or whatever, so l < r/2 new nodes in
> the document (with l being the number of elements with
> cross-references and r the number of cross-references).
> Any alternative suggestions? Any idea how to make use of @key?
>> On a first quick glance, it seems to me that the existing TEI @key
>> attribute has almost exactly the semantics of your "@altid". I
>> wouldn't recommend <index> for this purpose -- it's something different.
>> On 07/11/13 10:02, Christian Chiarcos wrote:
>>> Dear list members,
>>> I am currently working on a massive corpus of verse-aligned
>>> religious texts (Bibles, mostly, but also Qur'an editions) for
>>> linguistic and NLP purposes. In the beginning, I've been adapting
>>> the CES specifications Philip Resnik developed decades ago for a
>>> similar, small-scale project (in XML, not his SGML, of course). As
>>> we have outgrown the scale of his project by a wide margin, it is about
>>> time to update our format to a more recent standard, and TEI might
>>> be the format of choice.
>>> Yet, there are certain aspects specific to a parallel corpus of
>>> bibles, and I was wondering how to represent them with TEI:
>>> - All bibles share the same set of verse identifiers, but
>>> occasionally, a set of verses is not translated literally, but
>>> loosely translated within a larger segment. We introduced an
>>> additional attribute altid (alternate id), a sequence of NMTOKENS,
>>> each of which represents a regular bible ID (we did not choose IDREFS
>>> because they are not defined within the document). What would be the
>>> most efficient way to represent this properly?
>>> e.g. a multi-verse segment from a Low German (Westphalian) bible (in
>>> our CES-adaptation):
>>> <seg altid="b.MAT.17.22 b.MAT.17.23">
>>> Os soe sik in Galiläa uphoelen, sia Jesus: Doe Minskensuone
>>> sall baule den Hännen fan den Minsken iutliewert weren. Soe
>>> weret en dautmaken, owwer am drüdden Dage sall hoe wir upston.
>>> Do woören soe olle bedroöwet.
>>> </seg>
>>> vs. a verse segment in another Low German (Plautdietsch) bible
>>> <seg id="b.MAT.17.22" type="verse">
>>> Aus see enn Galilaea eromm jinje, saed Jesus to an: "De
>>> Menschesaen woat boolt enn Mensche aeare Henj jejaeft woare,
>>> </seg>
>>> <seg id="b.MAT.17.23" type="verse">
>>> en dee woare am doot moake, oba aum drede Dach woat hee fomm
>>> Doot oppstone." En siene Jinja weare seeha truarich do aewa.
>>> </seg>
>>> We query with XQuery across all bibles for a verse ID to compare
>>> differences across languages and language stages. The altids are
>>> inspected if a seg with the corresponding ID isn't found.
>>> - Not only seg, but also div elements may carry the altid attribute,
>>> e.g., for non-literal poetic bible adaptations where we have
>>> chapter- or book-level alignment only, but where smaller structures
>>> (e.g., l) exist.
>>> - altid also comes in handy if we want to mark cross-references to
>>> other bible passages that contain literal repetitions, e.g. (from
>>> the 1611 King James Version):
>>> <seg id="b.EXO.20.12" altid="b.DEU.5.16" type="verse">
>>> Honour thy father and thy mother: that thy dayes may bee long
>>> vpon the land, which the Lord thy God giueth thee.
>>> </seg>
>>> <seg id="b.DEU.5.16" altid="b.EXO.20.12" type="verse">
>>> Honour thy father and thy mother, as the Lord thy God hath
>>> commanded thee, that thy daies may be prolonged, and that it
>>> may goe well with thee, in the land which the Lord thy God
>>> giueth thee.
>>> </seg>
>>> With our querying strategy, these altids will be relevant if we want
>>> to retrieve matches from a Bible where the exact verse is lost, but
>>> a near analogue is nevertheless found. This specific verse is, for
>>> example, also quoted several times in the New Testament, and for
>>> languages with an NT only, we would like to have these matches if we
>>> query for b.EXO.20.12 or b.DEU.5.16.
>>> In TEI, the id would correspond to an xml:id, but what would be a
>>> good strategy to preserve the altid information without creating a
>>> large overhead (as using the index element would entail)?
>>> Thanks a lot,
>>> Christian Chiarcos