>> I would like to include in the basic architecture the ability to
>> reference parallels. Eventually, I want to be able to compare every
>> instance of a canonical work. E.g., there might be ten MSS of some
>> chapter of a biblical text that I want to look at all at once. It seems
>> to me that in order to do this, I need to incorporate a means to
>> consistently name parts of MSS that refer to the same canonical
>> reference (e.g. Matt 5:9). This is where I slipped into the slough.
> Can you please clarify this: are you proposing to encode 10 separate
> transcriptions of 10 papyri of the same work and then by means of
> <milestone> tags somehow overlay OR display side by side the resulting
Yes, there will be separate transcriptions of the same canonical text.
The text might be of Menander or Homer or whatever. There might be tens
or even hundreds of texts which cover the same part of the same work.
Whether I should use milestone tags or, say, divs with canonical
references as IDs is what I want you to tell me. Whether I take notice
of what you say is another matter !-)
The method of presentation is yet to be worked out. Nevertheless, I know
that I will want to be able to align texts from different TEI documents
by means of canonical references such as "Matt 5:9".
>> The competing hierarchies problem is this. The manuscripts are encoded
>> as lines of text from a column of a roll or from a folio of a codex.
> I don't get what you mean here. The only hierarchy is of your own
> making. This just sounds like text with embedded newline and new page
> markers. I would avoid defining hierarchies where they are not required
> because it might later lead to just such a "competing hierarchies"
The problem is that there _are_ competing hierarchies. One is the
hierarchy of the particular instance of the work, as we now possess it
in the form of a MS. This hierarchy might be represented as (MS ID,
folio no, side, line no) or (MS ID, side, column no, line no). Scholars
reference particular MSS in this way, so my TEI file needs to do the
same. The second hierarchy relates to the Platonic ideal behind the MS
instance--the work itself. This often has a canonical hierarchy such as
(Bible.Matthew.chapter005.verse009). Some potential users will want to
refer to the texts in this way.
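
To make the two hierarchies concrete, here is a rough Python sketch, not a TEI encoding. It assumes each stretch of transcribed text carries both kinds of address at once. The class name, the field names, the sigla "MS-A" and "MS-B", the "Matt.5.9" reference format and the placeholder texts are all invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A stretch of transcribed text carrying both kinds of address."""
    ms_id: str   # hypothetical manuscript siglum
    folio: int   # MS hierarchy: folio number
    side: str    # MS hierarchy: "r" (recto) or "v" (verso)
    line: int    # MS hierarchy: line number
    canon: str   # work hierarchy: canonical reference (format assumed)
    text: str    # placeholder for the transcribed text


def ms_address(seg):
    """Format the reference a scholar would use for this particular MS."""
    return f"{seg.ms_id} f.{seg.folio}{seg.side} l.{seg.line}"


def by_canon(segments):
    """Group segments from any number of MSS under their canonical reference."""
    groups = defaultdict(list)
    for seg in segments:
        groups[seg.canon].append(seg)
    return dict(groups)


segments = [
    Segment("MS-A", 3, "r", 12, "Matt.5.9", "placeholder text A"),
    Segment("MS-B", 7, "v", 4, "Matt.5.9", "placeholder text B"),
    Segment("MS-A", 3, "r", 13, "Matt.5.10", "placeholder text C"),
]

# All witnesses to one canonical reference, each still addressable by MS.
parallels = by_canon(segments)
for seg in parallels["Matt.5.9"]:
    print(ms_address(seg), "|", seg.text)
```

The point is only that neither address can be derived from the other, so both have to be recorded somewhere.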
>> What is more, I want to be able to look at what the scribe
>> wrote (let's call this the scribal view) or at an orthographically
>> levelled version (let's call this the substantive view).
> What do you mean by an "orthographically levelled version"?
This is a version in which a scribe's spelling peculiarities have been
levelled. E.g. "The chat sat on the mat" would be levelled to "The cat
sat on the mat." It is a very interesting exercise to compare MSS of the
same text firstly from a spelling perspective (e.g. group MSS according
to how particular words are spelled) and secondly from a substantive
perspective (e.g. group MSS according to which words they use at places
of variation among the MSS). To do the second properly, you need to
level orthographic differences.
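
As a toy illustration of the two kinds of grouping, here is a small Python sketch using the cat example. The levelling table, the sigla and the third reading ("rug") are invented. A real levelling table would have to be compiled by an editor.

```python
from collections import defaultdict

# Hypothetical levelling table: scribal spelling -> standard spelling.
LEVEL = {"chat": "cat"}


def level(words):
    """Return the reading with spelling peculiarities levelled."""
    return tuple(LEVEL.get(w, w) for w in words)


def group(readings, key):
    """Group MS sigla by the value of key(reading)."""
    groups = defaultdict(list)
    for ms_id, words in readings.items():
        groups[key(words)].append(ms_id)
    return {k: sorted(v) for k, v in groups.items()}


readings = {
    "MS-A": ("the", "chat", "sat", "on", "the", "mat"),
    "MS-B": ("the", "cat", "sat", "on", "the", "mat"),
    "MS-C": ("the", "cat", "sat", "on", "the", "rug"),
}

scribal = group(readings, key=tuple)      # spelling perspective
substantive = group(readings, key=level)  # substantive perspective
```

From the spelling perspective each of the three MSS stands alone. Once the spelling is levelled, MS-A and MS-B fall together and only MS-C differs.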
>> Rather, I
>> think that it is necessary to separate the Leiden part (lines of text
>> with markup identifying gaps, doubtful letters, editorial activity) from
>> the canonical part (lines of normalised text arranged in a canonical
> I don't like the sound of this. Isn't the canonical part just that of
> the underlying NT text, and the "Leiden part" the transcription of the
> papyrus? My first inclination would be to encode them as one so that
> the text of the papyrus was nailed onto the underlying canonical text.
> This would avoid the duplication that obviously arises otherwise.
I don't want to get too philosophical, but to which underlying text do
you refer? All we have are versions of it. We may have standard
references that point to sections of the text. However, there is no
guarantee that we know what words to put there. (We might argue about
where divisions in the canonical referencing scheme take place, but that
is a headache for another day.) Even if you choose some standard version
of the text as your base, you still have the competing hierarchies to
Here is a typical problem. My eight lines of text contain a few
canonical divisions. They also contain a few orthographic peculiarities
for which I want to indicate standardised spellings. They also contain
words that are divided across lines. How do I encode a place where a
word that has peculiar spelling is divided across two lines so that I
can do the following necessary tasks at a later point? (No need to
suggest answers. I just want to let you know the nature of the problem.
I am working on a solution and will post it to this group at some point
in the future, DV.)
(1) Present the text as it appears in a conventional edition with folio
and line numbers and some words divided across lines.
(2) Present the text with standard spelling using canonical reference
Throw in correctional activity and the complications get worse.
All that aside, I did try the approach you suggest. As I said in the
original post, it didn't look pretty. I abandoned that approach because
I did not think that someone who does not understand hierarchies, XML
and TEI particularly well would succeed in transcribing the relatively
simple example that I am using as a test case.
> I think the answers to your questions rather depend on the kind of
> output you envisage producing. That shouldn't be so, but a single
> amalgamated text containing all versions including the base
> (canonical?) text, lacunae, corrections, emendations etc would be so
> fantastically complex as to defy conventional editorial techniques.
Yes. The single amalgamated text is complex, which is why I am thinking
of separating out a "conventional" view (e.g. MS ID, folio, side, line)
from a "canonical" view (e.g. Work ID, primary div, secondary div, ...)
and presenting both in each TEI rendition of a MS.
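
Purely to illustrate what I mean by two views of one transcription, here is a rough Python sketch. It is not the encoding I will end up with. The data model (one record per piece of a word, with the levelled spelling carried on the first piece only), the field names and the "Work.1.1" style of reference are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    """One piece of a word as it sits on one MS line."""
    word_id: int   # pieces of the same word share an id
    line: int      # MS line number (conventional view)
    canon: str     # canonical reference (canonical view)
    scribal: str   # letters as the scribe wrote them on this line
    levelled: str  # levelled spelling on the first piece, "" on later pieces


# "chat" is a peculiar spelling of "cat" and is divided across lines 1 and 2.
pieces = [
    Piece(1, 1, "Work.1.1", "the", "the"),
    Piece(2, 1, "Work.1.1", "ch", "cat"),
    Piece(2, 2, "Work.1.1", "at", ""),
    Piece(3, 2, "Work.1.2", "sat", "sat"),
]


def conventional_view(pieces):
    """MS lines with scribal spelling; a hyphen marks a word that continues."""
    lines = {}
    for i, p in enumerate(pieces):
        nxt = pieces[i + 1] if i + 1 < len(pieces) else None
        continues = nxt is not None and nxt.word_id == p.word_id
        lines.setdefault(p.line, []).append(p.scribal + ("-" if continues else ""))
    return {n: " ".join(words) for n, words in lines.items()}


def canonical_view(pieces):
    """Levelled text arranged by canonical reference, ignoring line breaks."""
    refs = {}
    for p in pieces:
        if p.levelled:  # skip continuation pieces of a divided word
            refs.setdefault(p.canon, []).append(p.levelled)
    return {ref: " ".join(words) for ref, words in refs.items()}


print(conventional_view(pieces))
print(canonical_view(pieces))
```

Even this toy version dodges the hard cases, such as corrections or a canonical division that falls inside a divided word.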