For ease in processing, I sometimes encode lineation early, using a custom
schema with <line> . . . </line> tags, and then transform that with XSLT
to TEI in order to introduce markup that might span line breaks. Having
real start and end tags for lines during initial transcription is, in my
experience, quicker and more accurate than entering milestone <lb/> tags
directly, and <oXygen/> pretty-prints the text with each <line> on its own
line, which makes it easy to see, for example, that the lines that are
twice as long as they should be are the ones where I accidentally tagged
two lines as one <line>. I can't see that as easily with the <lb/>
notation. Once I've tagged the lines correctly, though, converting to TEI
with empty <lb/> elements isn't a problem.
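
To make that conversion step concrete, here is a minimal sketch in Python (not the XSLT I actually use for this). It assumes a hypothetical wrapper element called <lines> around the <line> elements, writes the result into a TEI-style <ab>, and assumes every break falls between words, so it puts a space before each following <lb/>. The function name lines_to_lb and the n numbering are made up for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<lines>
  <line>When 'Omer smote 'is bloomin' lyre,</line>
  <line>He'd 'eard men sing by land an' sea;</line>
</lines>"""

def lines_to_lb(source_xml):
    """Turn <line>...</line> containers into empty <lb/> milestones.

    Assumption: every line break falls between words, so each line
    except the last gets a trailing space before the next <lb/>.
    """
    root = ET.fromstring(source_xml)
    out = ET.Element("ab")
    lines = root.findall("line")
    for n, line in enumerate(lines, start=1):
        # The milestone marks where line n begins.
        lb = ET.SubElement(out, "lb", {"n": str(n)})
        # itertext() flattens any markup nested inside the line,
        # which is a simplification for this sketch.
        text = "".join(line.itertext()).strip()
        lb.tail = text if n == len(lines) else text + " "
    return ET.tostring(out, encoding="unicode")

print(lines_to_lb(SAMPLE))
```

The indentation and newlines that the pretty-printer put around each <line> are stripped here, so none of that layout white space leaks into the converted text.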
What's a bit trickier is that sometimes there's a word broken across lines
and sometimes the line break falls between words, and in my convention the
former has no spaces around the <lb/> and the latter has spaces. If you
encode those differently (hyphen at the end of the line, attribute on the
<lb/>, space or no space before or after the <lb/>, or whatever you find
convenient), you can easily treat the situations differently when you
convert to TEI.
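
As a rough illustration of treating the two cases differently, the sketch below (Python, purely for illustration) assumes the "hyphen at the end of the line" convention: a transcribed line ending in a hyphen means the word continues on the next line. In that case it emits <lb break="no"/> with no surrounding spaces and drops the hyphen; otherwise it emits <lb/> with a space on either side. The function name join_lines and the decision to drop the hyphen are my assumptions, and a genuine end-of-line hyphen in a compound word would need its own marker.

```python
from xml.sax.saxutils import escape

def join_lines(lines):
    """Join transcribed lines with <lb/> milestones between them.

    Assumed convention: a trailing hyphen marks a word broken
    across the line end; anything else is a break between words.
    """
    parts = []
    for i, line in enumerate(lines):
        text = line.strip()
        is_last = i == len(lines) - 1
        if is_last:
            parts.append(escape(text))
        elif text.endswith("-"):
            # Word-internal break: no spaces around the milestone.
            parts.append(escape(text[:-1]) + '<lb break="no"/>')
        else:
            # Break between words: spaces around the milestone.
            parts.append(escape(text) + " <lb/> ")
    return "".join(parts)

print(join_lines(["When 'Omer smote 'is bloom-", "in' lyre,", "He'd 'eard men sing"]))
```

Because the space-or-no-space decision is made once, at conversion time, later processing does not have to guess which white space around an <lb/> is significant.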
On 2/9/14, 5:52 PM, "Sebastian Rahtz" wrote:
>On 9 Feb 2014, at 22:08, Scott Derrick wrote:
>> And that is a simple example. With books, parts, chapters, sections,
>>line groups, paragraphs, sentences, etc... The indent level to make it
>>readable to humans comes at a great cost, just because of the pesky
>if you apply the indenter in e.g. oxygen, it does exactly the right thing
><lb/>When 'Omer smote 'is bloomin' lyre,<lb/>He'd 'eard men sing by land
>an' sea;<lb/>An' what he thought 'e might require,<lb/>'E went an' took
>-- the same as me!
>and doesn't introduce spurious white space. it looks perfectly readable
>to my (admittedly odd) eyes.
>> The evidence of how broken TEI is in regard to white space is the
>>plethora of "fixes" and kludgy code to deal with it.
>I have to dispute this. I don't think the TEI is broken at all as regards
>white space. You may argue that
>_XML_ is complex and counterintuitive sometimes, but the ways of
>processing it are quite clear
>and shouldn't be classified as "fixes". It's just that you have to write
>processing code carefully.
>> We have a <lb/>(line break) tag why not a <le/>(line end) tag?
>well, it's in the name - <lb> says "line break", not "line start" .
>> Though I do think using paired, <lb n="x"/>line of text goes here<lb
>>n="x" type="end"/> may solve the problem. And reduce the kludge factor
>>in the white space handlers. in
>then you break the semantics of <lb/> which mean "here is where a line
>breaks". you're implying there are two breaks,
>when there is only one.
>if you mean you actually want
> <sl/>When 'Omer smote 'is bloomin' lyre,<el/>
> <sl/>He'd 'eard men sing by land an' sea;<el/>
>then you can already do this with <milestone> and <anchor> or the like.
>Director (Research) of Academic IT
>University of Oxford IT Services
>13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431
>Não sou nada.
>Nunca serei nada.
>Não posso querer ser nada.
>À parte isso, tenho em mim todos os sonhos do mundo.