Further to Francois Lachance's call for more tagging and for working with
IDs and IDREFs (in connection with references to linebreaks as textual
environment, in corrections etc.):
On Fri, 20 Jul 2001, Francois Lachance wrote:
> Needs more than a sign (as a "standing in for")--- it needs tagging.
> Such tagging can take the form of link encoding.
> The way you can refer to an element is by an ID/IDREF mechanism.
> Your <lb> element is given a unique identifier. Your editorial note can
> target that identifier.
Working with the ID/IDREF mechanism in this case is out of the question.
As I was saying before, I'm not interested in a single particular
linebreak, I'm interested in "a linebreak as textual environment".
So we are dealing with a group of entities here, not a single entity in
particular.
For an identification you need a single, unique and therefore identifiable
entity.
This is an identification:
X.Y. is the editor.
In terms of reference-to-linebreaks-as-environment there is no single,
unique, identifiable entity.
Instead you deal with (ultimately anonymous) members of a group of
entities that share a common feature (or a group of common features); this
feature defines this group.
The best you can do in such a case is give a definition:
A is the first letter of the Latin alphabet.
This statement applies to all instances of the letter A, wherever you find
one - not to a single, particular letter A.
A definition has to be unambiguous. Otherwise you just have a description:
A is a letter.
(If you add more descriptive features here, until you reach the level of
unambiguousness, you again get a definition.)
A linebreak-as-environment clearly is the anonymous member of the group of
linebreaks-as-environment. No identification possible, therefore no
ID/IDREF mechanism possible.
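
The difference between identification and definition can be sketched in a
few lines of Python. This is only an illustration under stated assumptions:
the sample paragraph, the id values and the helper names identify() and
members() are invented here and come from no TEI document or tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text; the id values are invented for illustration.
SAMPLE = (
    '<p>the <lb id="lb1"/>the cat <lb id="lb2"/>sat on '
    '<lb id="lb3"/>the mat</p>'
)
root = ET.fromstring(SAMPLE)


def identify(root, idref):
    """Identification: resolve an IDREF to the one element carrying that ID."""
    matches = [el for el in root.iter() if el.get("id") == idref]
    return matches[0] if len(matches) == 1 else None


def members(root, tag="lb"):
    """Definition: every element that shares the defining feature (the tag)."""
    return list(root.iter(tag))


single = identify(root, "lb2")  # one particular linebreak
group = members(root)           # any linebreak, none of them singled out
```

identify() can only ever hand back one particular linebreak, whereas
members() returns whatever happens to match the defining feature. A
reference to a linebreak-as-environment is of the second kind.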
You can give a definition though. (Or perhaps this is called "declaration"
in TEI - ?)
As the linebreak-as-textual-environment is textual, and you need to refer
to it within the attribute value of SIC, which takes CDATA, you by
necessity end up with a character entity to represent the
linebreak-as-environment. This character entity must be unambiguous and it
may have to be defined (declared ?).
Definitions are not exactly foreign to TEI. The DTD is full of them. The
DTD itself is one.
When reading the discussion on "butterfly" I came to believe that the
'vertical bar solution' (vertical bar representing a
linebreak-as-environment) was *fundamentally* wrong.
But I don't think it is fundamentally wrong; perhaps only 'accidentally' so.
The vertical bar
- may be ambiguous.
- may not be sufficiently defined TEI-internally.
This is the problem, not something to do with further tagging and ID/IDREF.
Alright, then let's create a character entity "&lb;" (very unlikely that
you'll ever come across an l-b-ligature, and it can be declared as an
entity within TEI).
Would that satisfy the experts?
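
A minimal sketch of the "&lb;" idea, assuming an XML parser (Python's
xml.etree.ElementTree) and an entity declared in the internal DTD subset.
The replacement character U+2424 (SYMBOL FOR NEWLINE) is an arbitrary
placeholder picked for this sketch; TEI prescribes nothing of the kind.

```python
import xml.etree.ElementTree as ET

# Assumption: "&lb;" is declared locally and expands to U+2424. Both the
# declaration and the replacement character are hypothetical.
DOC = (
    '<!DOCTYPE p [ <!ENTITY lb "&#x2424;"> ]>'
    '<p><corr sic="the &lb;the">the</corr></p>'
)

corr = ET.fromstring(DOC).find("corr")

# The parser expands the entity while reading the attribute value, so the
# application only ever sees character data.
sic_value = corr.get("sic")
```

After parsing, sic_value contains the placeholder character between "the "
and "the", and CORR has no child elements. Whether the chosen character is
unambiguous enough is exactly the question raised above.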
Another approach, and I would like to ask the experts not to reject it
outright, but to think about it for a moment:
Just use, within the attribute value of SIC, the character sequence
opening angle bracket + L + B + closing angle bracket,
i.e. <lb>. A correction would look like this:
<corr sic="the <lb>the">the</corr>
I've rejected this myself before, but:
<lb> is unambiguous. It is already (kind of) defined within TEI. Of
course <lb> will no longer be analyzed as a TEI element, but as character
data. But that's what we want, character data. A "de-activated" linebreak.
It's just an idea.
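
A rough check of the "de-activated linebreak" idea, assuming the file is
processed by an XML parser rather than an SGML one. XML does not allow a
literal "<" inside an attribute value, so in that setting the opening
bracket has to be written as "&lt;". The sample strings are invented for
this sketch.

```python
import xml.etree.ElementTree as ET

# Assumption: an XML parser. The escaped form is what XML requires; the raw
# form is the one written out in the correction above.
ESCAPED = '<p><corr sic="the &lt;lb>the">the</corr></p>'
RAW = '<p><corr sic="the <lb>the">the</corr></p>'

corr = ET.fromstring(ESCAPED).find("corr")
sic_value = corr.get("sic")                  # plain characters, not markup
has_lb_child = corr.find("lb") is not None   # no <lb> element was created

try:
    ET.fromstring(RAW)
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False
```

With the escaped form, sic_value holds the characters "the <lb>the" and no
lb element appears in the tree, which is the de-activated reading. The raw
form is rejected by this parser as not well-formed.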
Either way, "&lb;" or "<lb>", there'll be a problem if you ever wanted to
copy (move) the attribute value of SIC to an area with 'mixed content' of
CDATA and TEI-elements (convert the attribute to a note or turn CORR and
SIC around). Using "<lb>" you may end up with active linebreaks where you
want a non-active linebreak; using "&lb;" it's the same problem, just the
other way round.
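
The copying problem with "<lb>" can be sketched as follows, again assuming
XML and Python's xml.etree.ElementTree. The note element and both routes
are illustrations only, not a recommended conversion procedure.

```python
import xml.etree.ElementTree as ET

sic_value = "the <lb>the"  # attribute value after parsing: plain characters

# Route 1: hand the value to the serializer as text. It is escaped on
# output, so the linebreak stays inactive.
note = ET.Element("note")
note.text = sic_value
safe_markup = ET.tostring(note, encoding="unicode")

# Route 2: paste the value into markup as a string. The characters "<lb>"
# are now read as a tag. The slash is added only to keep the XML well-formed.
pasted = "<note>" + sic_value.replace("<lb>", "<lb/>") + "</note>"
active_children = [child.tag for child in ET.fromstring(pasted)]
```

Route 1 keeps the sequence as characters; Route 2 produces a real lb
element inside the note, i.e. an active linebreak where a non-active one
was wanted.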
I will be grateful for your assessment.
Univ. of Cambridge, Dept. of Linguistics
[I'm not going to be in next week, so I won't be able to reply