Dear Torsten,

As I said in an earlier message, to me the choice between (b) and (c) 
depends on some project-internal assumptions. If for some reason it 
would be easier for your processing or visualisation tools to be 
presented with a continuous stream of <w>s and, at the same time, the 
usage of <choice> were heavily restricted, then one could consider (c). 
Option (b) requires adjustments to how you count <w>s or how you point 
at them for visualisation (and there are a few more hidden assumptions 
here).
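
To make the counting point concrete, here is a minimal sketch in Python. It is only an illustration, not anything taken from your pipeline: the <ab> wrapper, the missing TEI namespace, and the function names naive_count and reading_count are all my own assumptions.

```python
import xml.etree.ElementTree as ET

# Options (b) and (c) from the original question, wrapped in a hypothetical
# <ab> and left un-namespaced for brevity. U+0304 is the combining macron.
OPTION_B = (
    "<ab><choice>"
    "<abbr><w>co\u0304=<lb/>silio</w></abbr>"
    "<expan><w>con=<lb/>silio</w></expan>"
    "</choice></ab>"
)
OPTION_C = (
    "<ab><w><choice>"
    "<abbr>co\u0304=<lb/>silio</abbr>"
    "<expan>con=<lb/>silio</expan>"
    "</choice></w></ab>"
)


def naive_count(root):
    """Count every <w> in the tree, wherever it sits."""
    return sum(1 for _ in root.iter("w"))


def reading_count(elem, prefer="expan"):
    """Count <w>s along one reading: inside <choice>, follow a single child."""
    if elem.tag == "choice":
        branch = elem.find(prefer)
        if branch is None and len(elem):
            branch = elem[0]  # fall back to the first alternative
        return reading_count(branch, prefer) if branch is not None else 0
    own = 1 if elem.tag == "w" else 0
    return own + sum(reading_count(child, prefer) for child in elem)


for label, xml in (("b", OPTION_B), ("c", OPTION_C)):
    root = ET.fromstring(xml)
    print(label, naive_count(root), reading_count(root))
```

With (b), the naive count sees the same word twice (once per alternative), so the counter has to know about <choice>; with (c), both counts agree at one.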

As far as semantic equivalence between <w> and <choice> is concerned, 
from the "lexical" point of view there is none. From the 
"constructional" point of view, in the cases that you have selected, 
there might be equivalence on the bottom-up route, but from the top-down 
perspective (and, as usual, depending on some other assumptions), you 
might need an extra test for each occurrence of <choice> to see whether 
it is really meant to stand for <w>, which altogether might increase 
processing time and complexity.
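
One possible shape of such an extra test, again only as a hedged Python sketch: the function name stands_for_single_word and the three sample fragments are hypothetical, and the criterion (every alternative holds exactly one <w> and nothing else) is just one assumption about what "standing for <w>" could mean under option (b).

```python
import xml.etree.ElementTree as ET


def stands_for_single_word(choice):
    """Hypothetical test: True if every alternative in <choice> consists of
    exactly one <w> with no text before or after it."""
    if len(choice) == 0:
        return False
    for alt in choice:
        if (alt.text or "").strip():
            return False  # text before the first child element
        if len(alt) != 1 or alt[0].tag != "w":
            return False  # zero or several children, or not a <w>
        if (alt[0].tail or "").strip():
            return False  # text after the <w>
    return True


# Illustrative fragments (un-namespaced; only the first comes from the thread).
WORD_LEVEL = ET.fromstring(
    "<choice><abbr><w>co\u0304=<lb/>silio</w></abbr>"
    "<expan><w>con=<lb/>silio</w></expan></choice>"
)
PHRASE_LEVEL = ET.fromstring(
    "<choice><sic><w>teh</w> <w>cat</w></sic>"
    "<corr><w>the</w> <w>cat</w></corr></choice>"
)
NO_W = ET.fromstring("<choice><abbr>Dr</abbr><expan>Doctor</expan></choice>")

for sample in (WORD_LEVEL, PHRASE_LEVEL, NO_W):
    print(stands_for_single_word(sample))
```

The check itself is cheap, but it has to run for every <choice> the top-down processor meets, which is the extra cost I mean.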

Yet another bunch of assumptions concerns the way in which the markup is 
to be constructed, and by whom. It may sometimes be easier to have 
encoding guidelines in which every instance of a word is to be wrapped 
in <w> tags, and <choice> is used for alternations (... | ...) at 
whichever level.

HTH and best regards,


On 11/06/15 16:45, Torsten Schassan wrote:
> Dear Magdalena, dear Piotr,
> thanks for your answers and your thoughts.
> The reason why I put "(sometimes)" in the subject line was exactly that
> in those cases where <choice> contains just that single "word" I do see
> the semantic equivalence. Nonetheless, I think that <choice> doesn't
> target the sub-word level but rather words or multiple words. Thus,
> I'm surprised that both of you would consider going for (c); I thought
> (b) would be more natural.
> To answer your other questions:
>> - Will your <w>s carry IDs? And if so, for what purpose(s)?
> <w> might carry IDs but its main purpose is to deal with the <lb/>
> according to the needs of the reader: either to show line breaks and the
> separator or to suppress both. Another typical application in our
> digital library is to attach to it the coordinates of the word on the
> page in order to highlight a search result.
>> - Do you envision using <choice> for anything else than hyphenated words?
> Yes, we use it for all possible editorial pairs (abbr+expan, sic+corr,
> orig+reg); it is a relatively rare case that one of these comes with an
> additional line break. Thus we've got "a stack" of incidents that the
> XSLT has to take care of during publication.
> Best, Torsten
> Am 11.06.2015 um 16:13 schrieb Magdalena Turska:
>> Dear Torsten,
>> I think the third option is probably the most typical situation - the
>> abbreviation is at the word level so you can wrap the <choice> in a <w>.
>> Why you want to mark up words is a whole different story. You said "in
>> our editions we usually wrap words (tokens) that go across lines in <w>,
>> e.g. <w>con=<lb/>silio</w>". Are they the only words you mark with <w>? If
>> so, why do they deserve this special treatment? I think only answering
>> these questions would allow one to judge one way of encoding "better"
>> than the other.
>> Magdalena
>> On 11 June 2015 at 14:28, Torsten Schassan wrote:
>>> Dear all,
>>> in our editions we usually wrap words (tokens) that go across lines in
>>> <w>, e.g. <w>con=<lb/>silio</w>.
>>> Now, that word is abbreviated and that fact would be represented using
>>> choice/abbr+expan.
>>> Would you say <choice> works on the same level as <w>, so that only one of
>>> them is needed, or not? Indeed, <w> is part of model.segLike while
>>> <choice> can contain larger portions of text thus belonging to
>>> model.linePart and model.pPart.editorial.
>>> Which encoding option would you consider to be best?
>>> a: mutual exclusiveness
>>> either just <w>con=<lb/>silio</w>
>>> or
>>> <choice>
>>>    <abbr>co&#x0304;=<lb/>silio</abbr>
>>>    <expan>con=<lb/>silio</expan>
>>> </choice>
>>> b: <w> inside
>>> <choice>
>>>    <abbr><w>co&#x0304;=<lb/>silio</w></abbr>
>>>    <expan><w>con=<lb/>silio</w></expan>
>>> </choice>
>>> c: <w> outside
>>> <w>
>>>    <choice>
>>>      <abbr>co&#x0304;=<lb/>silio</abbr>
>>>      <expan>con=<lb/>silio</expan>
>>>    </choice>
>>> </w>
>>> Curious, best, Torsten
>>> --
>>> Torsten Schassan
>>> Digitale Editionen
>>> Abteilung Handschriften und Sondersammlungen
>>> Herzog August Bibliothek, Postfach 1364, D-38299 Wolfenbuettel
>>> Tel.: +49-5331-808-130 (Fax -165), schassan {at}
>>> Handschriftendatenbank: