Hello list,

The project I'm working on <http://jewishliturgy.googlecode.com> intends 
to encode Jewish prayer books ("siddur," sing.; "siddurim," pl.) in TEI 
and to make the texts freely available, along with a free software 
framework for rendering them in printable formats.  The resulting texts are 
intended to be used by individuals and congregations, and the majority 
of the end-user audience will likely be non-academic.  I'm currently 
working on the project's schema and encoding guidelines.

One of the problems we're encountering is that "the siddur" is not one 
book.  It's more like a lot of separate books with large amounts of 
overlapping content.  In its current forms, there are multiple major 
rites, and, even within the major rites, there are minor textual 
variations.  There are also many repeated texts with minor differences, 
where the choice of variant depends on the function of the text at its 
particular place in the service or on the time of year at which it is 
used.

Ideally, I would like to represent the text with as little repetition of 
core text as reasonably possible.

An example is the "Kaddish" <http://en.wikipedia.org/wiki/Kaddish>, 
which has 5 (or 6) major functional variants:
1. "Half kaddish" - used as a separator within services (shortest 
version, contains the first 2 paragraphs)
2. "Mourner's kaddish" - said by mourners during or at the end of the 
service (half kaddish+2 lines)
3. "Rabbi's kaddish" - said after a section of study included in the 
service has been read (mourner's kaddish+1 paragraph)
4. "Full kaddish" - said to complete a prayer service (mourner's 
kaddish+1 line)
5. "Burial/completion kaddish" - said either at funerals or at the 
completion of study of Talmud (full kaddish+multiple additions)
[A possible #6. "Reform mourner's kaddish" - traditional mourner's 
kaddish + 1 paragraph]
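
As a rough illustration of the "little repetition" goal, the layering of the five main variants above could be modeled as each variant naming a base variant plus its own additions.  This is only a sketch in Python, not the project's encoding: the segment names are hypothetical placeholders, and appending additions at the end is a simplifying assumption (in the real text an addition may be inserted elsewhere).

```python
# Hypothetical model: each functional variant = (base variant, own additions).
# Segment names are placeholders, not liturgical text.
VARIANTS = {
    "half": (None, ["paragraph_1", "paragraph_2"]),
    "mourners": ("half", ["mourners_line_1", "mourners_line_2"]),
    "rabbis": ("mourners", ["rabbis_paragraph"]),
    "full": ("mourners", ["full_line"]),
    "burial": ("full", ["burial_additions"]),
}


def segments(variant):
    """Return the list of segment names that make up a variant.

    Assumption: additions are simply appended after the inherited segments.
    """
    base, additions = VARIANTS[variant]
    inherited = segments(base) if base is not None else []
    return inherited + additions
```

Each core segment is stored once; for example, segments("full") resolves through "mourners" and "half" without the first two paragraphs being repeated in the table.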

Within those variants, there are additional variants that are time 
dependent, such as the repetition of 1 word in the second paragraph or 
the modification of 1 word in the final line; and some variants that are 
dependent on rite (for example, one line in the first paragraph).  In 
addition, there are numerous other rite variants in the text.

As far as I can tell, none of the stock TEI mechanisms for text 
versioning can represent this level of detail, though I may be missing 
something.

I wrote a proof of concept version as a TEI extension with 5 new 
elements (in a separate namespace, prefixed here "j"):
j:txtVar / j:txt (essentially a rename of tei:app / tei:rdg ) - 
variation in text
j:tempVar / j:temp - [temporal] variation in when a text is used
j:funcVar / j:func - functional variation
j:custVar / j:cust - variation in custom (usually for instructions)
j:varVar / j:var - variation that doesn't fit into any of the above 
categories

I'd been thinking a bit about how to link each variant to its source.  
The options (as far as I can tell) are: @wit pointing to a tei:witness 
(as on tei:rdg), a new attribute, or @decls pointing to a tei:sourceDesc or 
tei:sourceDesc/tei:biblStruct.  @decls seems like the most "portable" 
way to do it because then the same processing could be done on the 
variants as is done on tei:text or tei:div to link to sources.

The next problem is how to link each variant to a user's (or a text's) 
selections - and here's where I really might be getting into borderline 
TEI-abusive territory.  I defined three new attributes: @j:set, 
@j:include and @j:omit.  @j:set is a global attribute that allows a 
variant to be selected as active (similar to @select from the linking 
module).  @j:include and @j:omit can appear on tags such as the variant 
tags mentioned above, and define the conditions under which the tag 
should be included in a text.  When conditions are evaluated by 
@j:include or @j:omit, the values from the list of URIs are combined by 
logical or.  If @j:include evaluates to true, a true result from 
@j:omit overrides it and the tag is excluded.  All of the conditional 
attributes take as their content a list of URIs that point to TEI 
feature structures.
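
The include/omit rule described above can be sketched as a small Python function.  This is a minimal sketch, not the project's implementation: the function and parameter names are hypothetical, and treating a missing @j:include as "included by default" is an assumption that the description above does not settle.

```python
def is_included(active, include=None, omit=None):
    """Decide whether a tag should appear in the output.

    active  -- set of feature-structure URIs currently selected via @j:set
    include -- value of @j:include (whitespace-separated URIs) or None
    omit    -- value of @j:omit (whitespace-separated URIs) or None
    """
    # URIs within one attribute are combined by logical OR.
    # Assumption: a tag with no @j:include is included by default.
    include_ok = include is None or any(u in active for u in include.split())
    omit_hit = omit is not None and any(u in active for u in omit.split())
    # A true @j:omit overrides a true @j:include.
    return include_ok and not omit_hit
```

For instance, with active = {"#A", "#B"}, a tag carrying j:include="#A" and j:omit="#B" is excluded, because the omit result wins.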

Most of the switches are 3-way switches.  The possible feature values 
are: YES, NO and MAYBE.  When YES is selected, conditional text should 
always be included.  When NO is selected, conditional text should always 
be excluded.  When MAYBE is selected, conditional text should be 
included, and any instructional text associated with the conditional 
text should also be included.  For example, in the Kaddish, there's a 
word variant that is only included between the two high holidays of Rosh 
Hashanna and Yom Kippur.   A (simplified) example of the system would be 
this (excluding the documentation elements):

<!-- in the file header, define (and document) the conditional: -->
<tei:fsDecl type="time">
  <!-- a feature declaration inside tei:fsDecl is tei:fDecl, not tei:f -->
  <tei:fDecl name="BetweenRHandYK">
    <tei:vRange>
      <tei:vAlt>
        <tei:symbol xml:id="YES" value="YES"/>
        <tei:symbol xml:id="NO" value="NO"/>
        <tei:symbol xml:id="MAYBE" value="MAYBE"/>
      </tei:vAlt>
    </tei:vRange>
  </tei:fDecl>
</tei:fsDecl>
...
<!-- somewhere in the file: -->
<tei:fs xml:id="BetweenRHandYK_Y" type="time">
   <tei:f name="BetweenRHandYK" fVal="#YES" />
</tei:fs>
<tei:fs xml:id="BetweenRHandYK_N" type="time">
   <tei:f name="BetweenRHandYK" fVal="#NO" />
</tei:fs>
<tei:fs xml:id="BetweenRHandYK_M" type="time">
   <tei:f name="BetweenRHandYK" fVal="#MAYBE" />
</tei:fs>
...
<!-- the division specifying the possible times: -->
<tei:div j:set="#BetweenRHandYK_M">
...
  <!-- the text under variation: -->
  <j:tempVar>
    <!-- this is a bit overspecified to show everything being used: -->
    <j:temp j:include="#BetweenRHandYK_Y #BetweenRHandYK_M"
            j:omit="#BetweenRHandYK_N">
      <tei:note type="instruct" j:include="#BetweenRHandYK_M">Between Rosh Hashanna and Yom Kippur, add:</tei:note>
      <!-- text to be included conditionally here -->
    </j:temp>
  </j:tempVar>
...
</tei:div>
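
To show what the three settings would produce for the example above, here is a hedged Python sketch that prunes a copy of the fragment according to @j:include and @j:omit.  Several things are assumptions for illustration only: the namespace URI bound to "j" is made up, tei:seg with the text "CONDITIONAL TEXT" stands in for the conditional text, the active setting is passed in directly instead of being read from @j:set, and a tag without @j:include is treated as included.

```python
import xml.etree.ElementTree as ET

J = "http://example.org/ns/j"        # hypothetical URI for the "j" prefix
TEI = "http://www.tei-c.org/ns/1.0"  # TEI P5 namespace

SAMPLE = f"""
<tei:div xmlns:tei="{TEI}" xmlns:j="{J}">
  <j:tempVar>
    <j:temp j:include="#BetweenRHandYK_Y #BetweenRHandYK_M" j:omit="#BetweenRHandYK_N">
      <tei:note type="instruct" j:include="#BetweenRHandYK_M">Between Rosh Hashanna and Yom Kippur, add:</tei:note>
      <tei:seg>CONDITIONAL TEXT</tei:seg>
    </j:temp>
  </j:tempVar>
</tei:div>
"""


def keep(elem, active):
    """Apply the include/omit rule to one element."""
    inc = elem.get(f"{{{J}}}include")
    om = elem.get(f"{{{J}}}omit")
    included = inc is None or any(u in active for u in inc.split())
    omitted = om is not None and any(u in active for u in om.split())
    return included and not omitted


def prune(elem, active):
    """Recursively remove children whose conditions are not met."""
    for child in list(elem):
        if keep(child, active):
            prune(child, active)
        else:
            elem.remove(child)


def render(setting):
    """Return the whitespace-normalized text left after pruning."""
    root = ET.fromstring(SAMPLE)
    prune(root, {setting})
    return " ".join("".join(root.itertext()).split())
```

Under these assumptions, render("#BetweenRHandYK_Y") keeps only the conditional text, render("#BetweenRHandYK_N") keeps nothing, and render("#BetweenRHandYK_M") keeps the instruction followed by the conditional text, which matches the YES / NO / MAYBE behaviour described earlier.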

There are quite a few more complicated switches, involving combinations 
of switches and the application of "and" and "or" logic to figure out 
whether a particular text should be included.

Another option would be not using feature structures and defining a 
separate 3-way switch system specifically for the purpose.

Has anyone ever faced a problem like this?  How does this solution 
compare to something that might be developed by someone with more TEI 
experience?  Does it fit within the "TEI abstract model" or is it abusive?

Thanks,

-- 
----
Efraim Feinstein
http://jewishliturgy.googlecode.com