John McCaskey wrote:
> It does look to me like silence on the issue has led to different
Not for the first time.
I am left grateful that we use <lb/> so rarely,
and do not care much about preserving the rendering of the
source: so we've never worried much about whether <p> and
<item> and <l> assume a newline. But if usage is
being surveyed (and TCP usage has already been invoked), I would note:
(1) That though we think of <p> as a semantic unit, nothing
in our practice would be discommoded by making it equivalent to
HTML <p> -- or to <ab type="paragraph" rend="block">,
overriding it, if necessary, by attribute: <p rend="inline">.
(2) We also think of <item> and <l> as semantic units.
Again, we do not care much about preserving original
layout, but we would assume that both <l> and <item>,
like <p>, imply a newline unless explicitly overridden.
(3) In fact, we do use both <l> and <item> to capture
things not formatted in the source with line breaks.
E.g. lyrics printed along with music are not usually
printed in verse lines, but we prefer when possible
to 'reconstruct' the underlying <l>s, and do not care
that the output looks more verse-like than the source;
if we did care, we would override the default behaviour
of the <l>. As for lists, we are willing to capture
something like this:
An example of darapti is as follows:
All squares are rectangles; all squares are rhombs;
therefore some rhombs are rectangles.
<p>An example of darapti is as follows:
<list>
<item>All squares are rectangles;</item>
<item>all squares are rhombs;</item>
<item>therefore some rhombs are rectangles.</item>
</list></p>
and again do not care if it displays in a more list-like
fashion than the source.
(4) We're willing to put <l> around anything metrical,
regardless of whether it is formatted as verse, or
whether the (metrical) lines are acephalous, acaudate (anourous?),
or complete. In fact, the capture of acephalous lines
in <l> is very common amongst our texts: many, maybe
even most, Latin epigraphs begin mid-line; no matter:
if they're metrical they go in <l>.
<l>Custodes, aut aere domat: tunc corpore sano</l>
<l>Advocat Archigenem onerosaque pallia jactat.</l>
(5) If our source numbers its lines in a prose text,
we treat those numbers exactly as we treat similar
marginalia (e.g. the reference letters that roughly
mark out sections of the columns in the Patrologia and
elsewhere), unattached to any structural unit,
viz., as <milestone unit="line" n="5"> or
<milestone unit="section" n="B">, etc.
Verse line numbering, on the other hand, is like any
other structural numbering: printed line numbers,
like printed chapter numbers or stanza numbers,
are captured as the @n attribute of the corresponding
structural unit (<l>, <lg>, <div>).
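The distinction in (5) can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration: the function name collect_numbering, the STRUCTURAL set and the sample fragment are assumptions, not TCP tooling.

```python
import xml.etree.ElementTree as ET

# Assumption: the structural units whose printed numbers live in @n.
STRUCTURAL = {"l", "lg", "div"}

def collect_numbering(root):
    """Split printed numbering into free-floating milestones and structural @n values."""
    floating, structural = [], []
    for el in root.iter():  # document order
        n = el.get("n")
        if n is None:
            continue
        if el.tag == "milestone":
            # Prose line numbers and reference letters: attached to no unit.
            floating.append((el.get("unit"), n))
        elif el.tag in STRUCTURAL:
            # Verse line, stanza and chapter numbers: carried by the unit itself.
            structural.append((el.tag, n))
    return floating, structural

# Hypothetical fragment, made up for this example.
sample = ET.fromstring(
    '<div n="1"><p>Prose <milestone unit="line" n="5"/>with a line number'
    '<milestone unit="section" n="B"/>.</p>'
    '<lg n="2"><l n="10">A numbered verse line</l></lg></div>'
)
floating, structural = collect_numbering(sample)
print(floating)
print(structural)
```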
(6) We use <lb/> rarely and more or less as directly
equivalent to <br>. That is, to reflect a newline in the
source, and force one in the output, that is not associated
with any structural unit (like <l>, <p>, or <item>):
most commonly in the capture of lapidary inscriptions
that have been printed line-by-line in our source:
<p>Here lies<lb/>Jacob Brown<lb/>Of this Parish.</p>
We would not use <p><lb/> for exactly the same reason
that we would not use <p><br> in HTML.
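A minimal sketch of the defaults described in (1), (2) and (6), again in Python: <p>, <l> and <item> start on a new line unless @rend contains "inline", and <lb/> always forces a break, much as <br> does. The names render and BLOCK_BY_DEFAULT, and the reading of @rend as a whitespace-separated token list, are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: elements that imply a newline unless explicitly overridden.
BLOCK_BY_DEFAULT = {"p", "l", "item"}

def render(elem):
    """Return plain text for elem, applying the newline defaults sketched above."""
    parts = []

    def ensure_newline():
        # Add a newline only if some text exists and it does not already end in one.
        text = "".join(parts)
        if text and not text.endswith("\n"):
            parts.append("\n")

    def walk(node):
        if node.tag == "lb":
            parts.append("\n")  # forced break, tied to no structural unit
        else:
            is_block = (node.tag in BLOCK_BY_DEFAULT
                        and "inline" not in node.get("rend", "").split())
            if is_block:
                ensure_newline()
            if node.text:
                parts.append(node.text)
            for child in node:
                walk(child)
            if is_block:
                ensure_newline()
        if node.tail:
            parts.append(node.tail)

    walk(elem)
    return "".join(parts).strip("\n")

epitaph = ET.fromstring("<p>Here lies<lb/>Jacob Brown<lb/>Of this Parish.</p>")
print(render(epitaph))
```

With this sketch, two sibling <l> elements come out on separate lines, while <l rend="inline"> elements stay on the line of the enclosing <p>.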
(7) We treat all the breaks as *events in the act of
reading* rather than as descriptions of the physical
layout per se, albeit events prompted by a physical fact.
Thus a <pb> is the act of turning the page. There are
many books in which the reader naturally turns to the
same page more than once: each 'page-turn' is represented
by a separate <pb/> in the text stream. E.g. a paragraph
on page 3 may contain a footnote (we embed the note at the point
to which it refers), and the footnote may be continued
on page 4, so the note contains a <pb n="4">, but at the
conclusion of the note the reader turns back to the remainder
of the paragraph on page 3, so there is another <pb n="3">
at the end of the note, and another <pb n="4"> as the
paragraph is continued onto page 4. <cb/>, if we used
it, would behave the same, and <lb/> likewise, though only
in exceptional circumstances.
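Read this way, the <pb>s in document order simply give the sequence of page-turns. A small Python sketch, in which the sample fragment and the function name page_turns are made up for the example, and the returning <pb n="3"/> is placed just after the note, which is one possible reading of "at the end of the note":

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on the footnote case described above.
SAMPLE = (
    '<p>Paragraph text on page 3'
    '<note>note begins on page 3<pb n="4"/>and ends on page 4</note>'
    '<pb n="3"/> remainder of the paragraph on page 3'
    '<pb n="4"/> continued onto page 4.</p>'
)

def page_turns(root):
    """List the @n of every <pb> in document order, one entry per page-turn."""
    return [pb.get("n") for pb in root.iter("pb")]

turns = page_turns(ET.fromstring(SAMPLE))
print(turns)
```

The same page number may appear more than once in the list, since each entry is an event in reading rather than a unique page.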
(8) The <lb/>s in the ECCO example from the OTA were
Sebastian's (?) translation of the TCP end-of-line-hyphen
characters, which were not captured originally as <lb/> at
all, though they do (usually) represent a newline in the
source (occasionally they represent some other break
in the source, e.g. that occasioned by an intervening
floating graphic). But that aside, yes, they represent
TCP practice, not OTA practice.
(9) I don't think we've tried to handle things like the
Eliot quotation, but we'd probably settle on leaving
the problematic slash as a literal:
<p>April, as T.S. Eliot observed, is
<quote><l rend="inline">the cruellest month, breeding / </l>
<l rend="inline">Lilacs out of the dead land</l></quote></p>
The alternative would presumably be (since it is simply
a conventional way of rendering an <l>) to remove the
literal character and express it with @rend on the <l>:
<p>April, as T.S. Eliot observed, is
<quote><l rend="inline">the cruellest month, breeding</l>
<l rend="inline initialVirgule">Lilacs out of the dead land</l></quote></p>
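If the slash is expressed in @rend rather than kept as a literal, an output routine can put it back. A rough Python sketch, assuming @rend is a whitespace-separated token list and that "initialVirgule" means "print a slash before this line"; the function name render_inline_lines is hypothetical, and only the direct text of each <l> is handled:

```python
import xml.etree.ElementTree as ET

def render_inline_lines(quote):
    """Join the <l> children of a quotation on one line, restoring the virgule from @rend."""
    pieces = []
    for line in quote.iter("l"):
        tokens = line.get("rend", "").split()
        # Normalise internal whitespace; nested markup inside <l> is ignored here.
        text = " ".join((line.text or "").split())
        if "initialVirgule" in tokens:
            text = "/ " + text
        pieces.append(text)
    return " ".join(pieces)

quote = ET.fromstring(
    '<quote><l rend="inline">the cruellest month, breeding</l>'
    '<l rend="inline initialVirgule">Lilacs out of the dead land</l></quote>'
)
print(render_inline_lines(quote))
```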
Paul Schaffner | http://www.umich.edu/~pfs/
316-C Hatcher Library N, Univ. of Michigan, Ann Arbor MI 48109-1190