The project I'm working on <http://jewishliturgy.googlecode.com> intends
to encode Jewish prayer books ("siddur," sing.; "siddurim," pl.) in TEI,
and make the texts freely available with a free software framework to
represent the texts in printable formats. The resulting texts are
intended to be used by individuals and congregations, and the majority
of the end-user audience will likely be non-academic. I'm currently
working on the project's schema and encoding guidelines.
One of the problems we're encountering is that "the siddur" is not one
book. It's more like a lot of separate books with large amounts of
overlapping content. In its current forms, there are multiple major
rites, and, even within the major rites, there are minor textual
variations. Many texts are also repeated with minor differences, and
the variant to use depends on the function of the text at its particular
place in the service or on the time of year when the text is being
used.
Ideally, I would like to represent the text with as little repetition of
core text as reasonably possible.
An example is the "Kaddish" <http://en.wikipedia.org/wiki/Kaddish>,
which has 5 (or 6) major functional variants:
1. "Half kaddish" - used as a separator within services (shortest
version, contains the first 2 paragraphs)
2. "Mourner's kaddish" - said by mourners during or at the end of the
service (half kaddish+2 lines)
3. "Rabbi's kaddish" - said after a section of study included in the
service has been read (mourner's kaddish+1 paragraph)
4. "Full kaddish" - said to complete a prayer service (mourners
5. "Burial/completion kaddish" - said either at funerals or at the
completion of study of Talmud (full kaddish+multiple additions)
[A possible #6. "Reform mourner's kaddish" - traditional mourner's
kaddish + 1 paragraph]
Within those variants, there are additional variants that are time
dependent, such as the repetition of 1 word in the second paragraph or
the modification of 1 word in the final line; and some variants that are
dependent on rite (for example, one line in the first paragraph). In
addition, there are numerous other rite variants in the text.
As far as I can tell, none of the stock TEI mechanisms for text
versioning can represent this level of detail(?).
I wrote a proof-of-concept version as a TEI extension with 5 new
pairs of elements (in a separate namespace, prefixed here "j"):
j:txtVar / j:txt (essentially a rename of tei:app / tei:rdg) -
variation in text
j:tempVar / j:temp - [temporal] variation in when a text is used
j:funcVar / j:func - functional variation
j:custVar / j:cust - variation in custom (usually for instructions)
j:varVar / j:var - variation that doesn't fit into any of the above
I'd been thinking a bit about how to link each variant to its source.
The options (as far as I can tell) are: @wit pointing to a tei:witness
(as on tei:rdg), a new attribute, or @decls pointing to a tei:sourceDesc
or tei:sourceDesc/tei:biblStruct. @decls seems like the most "portable"
way to do it because the variants could then be processed the same way
tei:text or tei:div are when they are linked to their sources.
The next problem is how to link each variant to a user's (or a text's)
selections - and here's where I really might be getting into borderline
TEI-abusive territory. I defined three new attributes: @j:set,
@j:include and @j:omit. @j:set is a global attribute that allows a
variant to be selected as active in the (similar to @select from linking
module). @j:include and @j:omit can appear on tags such as the variant
tags mentioned above, and define the conditions under which the tag
should be included in a text. All of the conditional attributes take a
list of URIs as their content that point to TEI feature structures.
When @j:include or @j:omit is evaluated, the results for the URIs in its
list are combined by logical or. If both attributes are present and both
evaluate to true, @j:omit overrides @j:include and the tag is left out.
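
To make the evaluation rule concrete, here is a rough sketch in Python. It is not the project's code: the function name is_included and the "active" set are made up for illustration, and I am assuming that a tag with no @j:include is included by default, which the description above does not actually settle.

```python
def is_included(include, omit, active):
    """Decide whether a conditional tag is kept.

    include, omit: lists of URI strings taken from @j:include / @j:omit,
                   or None when the attribute is absent.
    active:        set of URIs of the currently selected feature structures.
    """
    # Each list is combined by logical "or": one active URI is enough.
    # ASSUMPTION: a missing @j:include means "included by default".
    included = True if include is None else any(u in active for u in include)
    omitted = False if omit is None else any(u in active for u in omit)
    # A true @j:omit overrides a true @j:include.
    return included and not omitted


# One of the two listed URIs is active, so the tag is kept.
print(is_included(["#A", "#B"], None, {"#B"}))      # True
# @j:include matches, but @j:omit also matches and wins.
print(is_included(["#A"], ["#C"], {"#A", "#C"}))    # False
```

The URIs "#A", "#B" and "#C" are placeholders, not identifiers from the schema.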
Most of the switches are 3-way switches. The possible feature values
are: YES, NO and MAYBE. When YES is selected, conditional text should
always be included. When NO is selected, conditional text should always
be excluded. When MAYBE is selected, conditional text should be
included, and any instructional text associated with the conditional
text should also be included. For example, in the Kaddish, there's a
word variant that is only included between the two high holidays of Rosh
Hashanna and Yom Kippur. A (simplified) example of the system would be
this (excluding the documentation elements):
<!-- in the file header, define (and document) the conditional: -->
<tei:symbol xml:id="YES" value="YES"/>
<tei:symbol xml:id="NO" value="NO"/>
<tei:symbol xml:id="MAYBE" value="MAYBE"/>
<!-- somewhere in the file: -->
<tei:fs xml:id="BetweenRHandYK_Y" type="time">
  <tei:f name="BetweenRHandYK" fVal="#YES"/>
</tei:fs>
<tei:fs xml:id="BetweenRHandYK_N" type="time">
  <tei:f name="BetweenRHandYK" fVal="#NO"/>
</tei:fs>
<tei:fs xml:id="BetweenRHandYK_M" type="time">
  <tei:f name="BetweenRHandYK" fVal="#MAYBE"/>
</tei:fs>
<!-- the division specifying the possible times: -->
<!-- the text under variation: -->
<!-- this is a bit overspecified to show everything being used: -->
<j:temp j:include="#BetweenRHandYK_Y #BetweenRHandYK_M">
  <tei:note type="instruct" j:include="#BetweenRHandYK_M">Between
  Rosh Hashanna and Yom Kippur, add:</tei:note>
  <!-- text to be included conditionally here -->
</j:temp>
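
For anyone who wants to see how a processor might act on the fragment above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. Several things in it are assumptions made only for illustration: the namespace URI bound to "j" is a placeholder, the tei:seg element stands in for the conditional text, the functions kept and render are hypothetical names, and a tag without @j:include is treated as included.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
J = "http://example.org/ns/j"  # ASSUMPTION: placeholder URI for the "j" prefix

# The variant from the example; tei:seg is a stand-in for the real text.
SAMPLE = f"""
<j:temp xmlns:j="{J}" xmlns:tei="{TEI}"
        j:include="#BetweenRHandYK_Y #BetweenRHandYK_M">
  <tei:note type="instruct" j:include="#BetweenRHandYK_M">Between
  Rosh Hashanna and Yom Kippur, add:</tei:note>
  <tei:seg>[conditional text]</tei:seg>
</j:temp>
"""


def kept(elem, active):
    """True if @j:include / @j:omit allow this element under `active`."""
    include = elem.get(f"{{{J}}}include")
    omit = elem.get(f"{{{J}}}omit")
    # URIs in one attribute are combined by logical "or".
    ok = include is None or any(u in active for u in include.split())
    # A matching @j:omit overrides a matching @j:include.
    if omit is not None and any(u in active for u in omit.split()):
        ok = False
    return ok


def render(elem, active):
    """Collect the text of `elem`, skipping subtrees that are switched off."""
    if not kept(elem, active):
        return ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, active))
        parts.append(child.tail or "")
    # Normalize whitespace left over from the XML layout.
    return " ".join("".join(parts).split())


root = ET.fromstring(SAMPLE)
print(repr(render(root, {"#BetweenRHandYK_Y"})))  # '[conditional text]'
print(repr(render(root, {"#BetweenRHandYK_N"})))  # ''
print(repr(render(root, {"#BetweenRHandYK_M"})))
# 'Between Rosh Hashanna and Yom Kippur, add: [conditional text]'
```

With YES active the text appears without the instruction, with NO nothing appears, and with MAYBE both the instruction and the text appear, which matches the three-way behavior described earlier.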
There are quite a few more complicated switches, involving combinations
of switches and the application of "and" and "or" logic to figure out
whether a particular text should be included.
Another option would be not using feature structures and defining a
separate 3-way switch system specifically for the purpose.
Has anyone ever faced a problem like this? How does this solution
compare to something that might be developed by someone with more TEI
experience? Does it fit within the "TEI abstract model" or is it abusive?