I've more or less been following the discussion on the BUTTERFLY, and
although the debate seems to have abated now, I would like to add some
general comments, as I've also had a few problems with "linebreaks and
TEI".
If what I'm going to say has been said before, or is obvious to everyone,
please accept my apologies.
The thing that especially stuck in my mind was Lou Burnard's remark on
> the TEI's first commandment, viz "Thou shalt have no other markup
> scheme beside me". In general, overloading characters (especially
> things like vertical bars which dont travel well) ought to be avoided
> in documents intended for interchange.
It may, I think, be possible to defend the use of, for example, vertical
bars to mark the linebreak in *specific* cases, and to show that this is not in
violation of Lou Burnard's First Commandment.
[As for the rest of the quoted paragraph, I'm certainly not arguing
in favour of overloading characters. However, the problem that vertical
bars don't travel well is a minor one, since a character entity (brvbar,
I suppose) could naturally be used instead.]
My argument would be that linebreaks, in specific cases, aren't a part of
TEI proper at all and consequently any "markup" used for these specific
linebreaks doesn't violate TEI rules.
Let's first consider the following scenario:
You are transcribing the printed edition of a manuscript text. Now you
come across the following:
in the text:
... the [3] ...
I.e., there's a word with a footnote attached to it.
The footnote reads:
[3] the | the _MS_.
[If you can't see a vertical bar between the two _the_'s because it's not
properly displayed, please suppose there is one.]
What you've got here is a dittography (inadvertent repetition of a letter
or word) occurring across a linebreak, quite a common mistake in manuscripts
and prints.
The editor has corrected the mistake and gives the original reading in
the footnote, using the vertical bar as a sign for the linebreak (and has
explained this convention in the introduction of his text, hopefully).
Now you transcribe the above, perhaps inserting the note in the text or
elsewhere, and linking it to the word to which it is attached.
How would you transcribe the content of the footnote? You'll probably
keep it as it is, possibly using a character entity for the vertical bar.
You're not going to translate the *content* to TEI. So:
<note>the ¦ the <hi>MS</hi>.</note>
If you basically agree, read on. If you don't, you may wish to stop here.
Different scenario:
You've got a manuscript in front of you and directly transcribe it using
TEI. You're in fact editing directly in TEI.
The important thing is: there are still the same two levels as in the
first scenario - your edition of the text, and the transcription of the
selfsame edition in TEI. Your TEI file, so to speak, doubles as edition and
TEI-transcription. There's a primary level (edition) and a secondary level
(TEI) [or, if you take the source text as primary, a secondary and a tertiary
level; never mind].
Now, suppose you come across the same problem as the editor in the first
scenario, a dittography occurring across a linebreak, and you need to
refer to the linebreak in your correction.
You may get away with using <lb> if you use SIC to mark this up and give
the correction in its CORR attribute, i.e.
<sic corr="the">the <lb>the</sic>
That is, if you've kept the linebreaks everywhere, which you may not have
done.
You definitely can't use <lb> if you turn SIC and CORR around. Not only
would <lb> not be recognized as a TEI element within the value of the SIC
attribute, it would also be utterly wrong for very basic reasons.
[Wrong: <corr sic="the <lb>the">the</corr>]
!!! This is basically wrong because the linebreak in the example is no
linebreak !!!
There's no line to be broken here. It's not a linebreak proper, an
instruction to start a new line. Even if TEI did recognize <lb> as a TEI
element within the attribute value, you certainly wouldn't want a
real linebreak to suddenly jump in your face in this case.
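
[A rough sketch of the first point, under assumptions that are not part of
the discussion above: it uses an XML-based encoding (so the empty element is
written <lb/> rather than the SGML <lb>) and Python's standard
xml.etree.ElementTree parser. The snippet and function names are made up for
illustration. In XML a literal "<" inside an attribute value is not even
well-formed, so the parser never sees an <lb> element there.]

```python
import xml.etree.ElementTree as ET

# Hypothetical snippets modelled on the examples in this message.
# Assumption: XML syntax, hence <lb/> instead of the SGML <lb>.
SIC_WITH_LB = '<sic corr="the">the <lb/>the</sic>'
CORR_WITH_LB = '<corr sic="the <lb/>the">the</corr>'


def parses(snippet):
    """Return True if the snippet is well-formed XML, False otherwise."""
    try:
        ET.fromstring(snippet)
        return True
    except ET.ParseError:
        return False


def child_tags(snippet):
    """List the element children the parser actually recognizes."""
    return [child.tag for child in ET.fromstring(snippet)]
```

Here parses(SIC_WITH_LB) succeeds and child_tags(SIC_WITH_LB) reports the
<lb/> as a real element, while parses(CORR_WITH_LB) fails. This only shows
the syntactic half of the argument; the more basic objection is the one that
follows.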
The linebreak in the example is, as it were, a linebreak in inverted
commas, a meta-linebreak. You're TALKING ABOUT a "linebreak", you're not
using one.
>>So, there are linebreaks and there are meta-linebreaks, and the
distinction is important.
The linebreak changes its status as soon as you talk about it.
And to be able to talk about it, write about it, refer to it in the
succinct way editors do, you need a sign for it. Choose any, but a vertical
bar is common practice.
>>So, a meta-linebreak may need a sign.
When you talk about the linebreak and determine a sign for it, you do so
in your function as editor. Thus the sign chosen is part of your edition,
your editorial conventions, part of the primary level of your
edition-cum-TEI-transcription file.
>>The sign is editorial.
Thus, this 'meta-linebreak', like the letters of the text, does not belong
to the TEI-level - whether or not the edition exists in printed form (or
just in your head). It is character data.
>>The sign has nothing to do with TEI.
Consequently, something like this, I would argue:
<corr sic="the ¦ the">the</corr>
would not violate Lou Burnard's First Commandment. You're not really using
"foreign markup". You're using "markup" all right, of a kind, but on a
different level from TEI. It doesn't compete with TEI.
(Does the Spanish N with a tilde as the "character markup" for a palatal
nasal compete with TEI, being foreign "markup"? I suspect it doesn't.)
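
[To illustrate that the sign is just character data, here is a minimal
sketch, again assuming an XML-based encoding and Python's standard
xml.etree.ElementTree parser. The names LINEBREAK_SIGN, original_reading and
segments are hypothetical, and the choice of the broken bar as the sign is
simply the editorial convention used in the examples above.]

```python
import xml.etree.ElementTree as ET

# Assumption: the broken bar (U+00A6, entity name brvbar) is the editorial
# sign for a linebreak, as declared in the edition's conventions.
LINEBREAK_SIGN = "\u00a6"


def original_reading(snippet):
    """Return the SIC attribute value exactly as the parser delivers it."""
    return ET.fromstring(snippet).get("sic")


def segments(reading, sign=LINEBREAK_SIGN):
    """Split a reading at the editorial sign; this is the editor's
    convention at work, not anything the markup scheme knows about."""
    return [part.strip() for part in reading.split(sign)]
```

Given '<corr sic="the &#xA6; the">the</corr>', original_reading returns a
plain string containing the bar, and the parser attaches no meaning to it.
Only a reader (or program) that knows the editorial convention, as segments
does, interprets it as "a linebreak occurred here".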
The situation with those two levels (editorial here, TEI there) admittedly
gets a bit muddled, as TEI does provide some editorial markup of its
own. But then TEI contains different types of markup: structural,
editorial...
But not for the 'meta-linebreak'.
And there's no need for it really, either, is there? Unless someone
would want to standardize, or at least give some recommendation.
Naturally, everything said about the linebreak applies mutatis mutandis to
pagebreaks, columnbreaks, etc.
As I was saying before, if all of this has been absolutely obvious, please
accept my apologies for wasting your time.
Ingo Mittendorf
University of Cambridge, Department of Linguistics