Alright, Francois, never mind the meta and ontological mix of levels, I
didn't mean it to be more than a label. Let's call it something else,
linebreak and reference-to-a-linebreak.
I hope that we may at least agree that there is some such distinction;
and that linebreak here is no irrelevant layout feature.
Some of the more peripheral stuff first:
As for the problem that you may end up with a "sign" that is no longer
unambiguous because the 34th text out of 35 turns out to contain that very
character, this can surely be solved. And I don't expect to find any
vertical bars in those texts I'm dealing with. I'm well aware of good
practice and keeping things unambiguous. If nothing else helps, create
some unambiguous entity.
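To make the point about ambiguity concrete, here is a minimal sketch of how one might check a transcription for the chosen sign before adopting it. Everything in it is an assumption for illustration: the vertical bar as the sign, the function name marker_in_text, and the two tiny sample paragraphs.

```python
import xml.etree.ElementTree as ET

MARKER = "|"  # assumed sign standing for "reference to a linebreak"


def marker_in_text(xml_source: str, marker: str = MARKER) -> bool:
    """Return True if the marker occurs in the transcribed text itself."""
    root = ET.fromstring(xml_source)
    # itertext() yields element content only, never attribute values,
    # so a marker inside sic="..." does not count as a clash.
    return any(marker in chunk for chunk in root.itertext())


# Hypothetical samples: the second one uses the bar in the running text.
clean = '<p>saw <corr sic="the | the">the</corr> cat</p>'
clash = '<p>a | b <corr sic="the | the">the</corr></p>'

print(marker_in_text(clean))
print(marker_in_text(clash))
```

If the check ever reports a clash, that is the moment to switch to some other sign or to a named entity.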
As for the processor not being able to understand what the attribute value
is supposed to mean: Doesn't the SIC attribute contain CDATA? Perhaps I'm
naive, but the reference-to-a-linebreak thing would be read as CDATA as
well. The exact nature of a mistake and its possible cause are for the human
user to determine. I'm not interested in any highly sophisticated stuff
here, I just want to record the manuscript reading, which, unfortunately,
may include a linebreak. The issue of ambiguity aside, any other
Now the central issue: link encoding and referring to an element by an
ID/IDREF mechanism.
I'm absolutely not clear how this would work in cases like the one I
mentioned, dittography (... the <lb> the ...), or with omitted hyphens
(... link<lb>ing ...).
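As far as I understand the suggestion, it would amount to something like the following for the dittography case. This is only my guess at what is meant, not an approved encoding: the id value "lb1", the wording of the note, and the helper name resolve_notes are all made up for the example, and I write the empty element as <lb/> so that an XML parser accepts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <lb/> carries an identifier, the note targets it.
DOC = (
    '<p>... the <lb id="lb1"/>the ... '
    '<note target="lb1">dittography across the linebreak</note></p>'
)


def resolve_notes(xml_source: str):
    """Pair each note's text with the tag of the element it targets."""
    root = ET.fromstring(xml_source)
    # Index every element that carries an id attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    pairs = []
    for note in root.iter("note"):
        target = by_id.get(note.get("target"))
        # None marks a target that could not be resolved.
        pairs.append((note.text, target.tag if target is not None else None))
    return pairs


print(resolve_notes(DOC))
```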
I would like to point out that I don't want to refer to any individual
linebreak, but to "linebreak" as an environment in general. It's the
environment that counts, not any linebreak in particular. Even if working
with ID/IDREF stuff, you would ultimately need something that stands for
"reference-to-linebreak-as-environment". What other way is there than to work
with a sign, a character entity or whatever - some succinct shortcut for this?
If it helps, define it somewhere. But this isn't a lot different from the
vertical bar solution in the end, where there's admittedly a slight danger of
ambiguity and which may rely on the general knowledge of the user (and even
here you could, in your section on editorial policy, make things explicit).
I feel unable to view the problem from any other angle. You create an
object, label it, define it, use it.
What's the problem with using or creating a character (entity)? Ambiguity
can be avoided. Definitions can be given. Nothing insurmountable there.
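What I have in mind can be sketched as follows. The entity name lbref and its replacement character (the broken bar, U+00A6) are my own assumptions, chosen only for illustration; the definition sits in the document's internal subset, and an ordinary parser hands the attribute value back as a plain string with the entity expanded.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "lbref" is declared once and then used inside sic.
DOC = """<!DOCTYPE p [
  <!-- lbref: assumed to mean "a linebreak occurs here in the manuscript" -->
  <!ENTITY lbref "&#xA6;">
]>
<p>... <corr sic="the &lbref; the">the</corr> ...</p>"""

root = ET.fromstring(DOC)
# The parser expands the internal entity inside the attribute value.
sic = root.find("corr").get("sic")
print(sic)
```

Whatever the reading means is then a matter of the declared definition plus the section on editorial policy, not of the parser.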
If there's a fundamentally different approach, I can't see it. But then
I'm no expert on the inner mysteries of TEI.
I have to admit that I'm slightly worried how complicated this gets.
Errors occurring because of linebreaks are run-of-the-mill phenomena. I
haven't come across a single solution that would satisfy me. Whenever I seem
to have found a way out, it gets blocked again just as quickly. It can't be that
difficult, can it?
TEI is supposed to be easy to use. Tractable for the average user. Most of
it is, although it often doesn't show. But in some areas you can't help
but feel as if you were in a city where the streets have no name.
So, I need a solution, officially approved and showing the stamp.
If possible, in the foreseeable future. After all, there are also some
transcriptions to be made.
The task is: reference to a linebreak as an environment within attribute
values, especially SIC and ORIG.
Condition: Must be simple to use. No more than lower average complexity.
Not time-consuming. Must be convertible to something printable. Preferably
in line with the TEI rules.
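To show what I mean by "convertible to something printable", here is a rough sketch under stated assumptions: the vertical bar stands for the linebreak, the manuscript reading sits in the sic attribute of corr, and the bracketed "[ms: ...]" layout is merely one possible print convention, not a recommendation. The function name printable is made up, and the sketch only handles a flat paragraph.

```python
import xml.etree.ElementTree as ET


def printable(xml_source: str) -> str:
    """Render a flat paragraph, appending the manuscript reading to each corr."""
    root = ET.fromstring(xml_source)
    parts = [root.text or ""]
    for child in root:
        content = "".join(child.itertext())
        sic = child.get("sic")
        if child.tag == "corr" and sic is not None:
            # Corrected text first, manuscript reading (with its sign) after.
            parts.append(f"{content} [ms: {sic}]")
        else:
            parts.append(content)
        # Text following the child element belongs to the paragraph.
        parts.append(child.tail or "")
    return "".join(parts)


SAMPLE = '<p>saw <corr sic="the | the">the</corr> cat</p>'
print(printable(SAMPLE))
```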
On Fri, 20 Jul 2001, Francois Lachance wrote:
> I don't know about all readers of the TEI list but for me a Metalinebreak
> discourse would be a discourse about linebreaks rendered in linebreaks
> (just as metalinguistic refers to the use of verbal artefacts to
> describe verbal artefacts)
> > > <sic corr="the">the <lb>the</sic> >
> > The linebreak changes its status as soon as you talk about it.
> ontological mix of levels here related to the distinction between
> "mention" and "use"
> > And to be able to talk about it, write about it, refer to it in the
> > succinct way editors do, you need a sign for it. Choose any, but a vertical
> > bar is common practice.
> > >>So, a meta-linebreak may need a sign.
> > When you talk about the linebreak and determine a sign for it, you do so
> > in your function as editor. Thus the sign chosen is part of your edition,
> > your editorial conventions, part of the primary level of your
> > edition-cum-TEI-transcription file.
> Needs more than a sign (as a "standing in for")--- it needs tagging.
> Such tagging can take the form of link encoding.
> The way you can refer to an element is by an ID/IDREF mechanism.
> Your <lb> element is given a unique identifier. Your editorial note can
> target that identifier.
> > >>The sign has nothing to do with TEI.
> > Consequently, something like this, I would argue:
> > <corr sic="the ¦ the">the</corr>
> > would not violate Lou Burnard's First Commandment.
> Whether violation there is or not, how is a processor to distinguish what
> the attribute value is supposed to mean? What if the manuscript you are
> transcribing contains an instance of the symbol you are using to indicate
> the feature you are encoding (in this case a bar (symbol) and a linebreak
> You may wish to review the archive for a discussion of attribute grammars.
> One by Wendell Piez in relation to the mark up of frames and divisions.
> One back in March 2000 by Gregory Murphy is most useful and succinct
> reminder of good practices...
> Francois Lachance, Scholar-at-large
> 20th : Machine Age :: 21st : Era of Reparation