I don't see a conflict between <w> and <choice>, and I think the "choice"
(HA!) is really about whether you are trying to encode that there is one
word and that word is inscribed one of two ways, or that there is a choice
between two words with the same meaning and different inscriptions.  If
it's the former, I'd say put the choice inside the word; if the latter, put
the word(s) inside the choice.
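
To make the practical difference concrete, here is a minimal sketch in
Python (standard library only) of how a naive word count and an
"expanded reading" extraction behave under the two nestings. The snippets
are the ones from the original question, written without the TEI namespace
for brevity; the function names count_w and expanded_reading are made up
for this illustration and are not part of any TEI tooling.

```python
import xml.etree.ElementTree as ET

# <choice> inside <w>: one word, inscribed two ways (option c in the thread).
# Assumption: no TEI namespace, unlike a real TEI file.
CHOICE_IN_W = (
    "<w><choice>"
    "<abbr>co\u0304=<lb/>silio</abbr>"
    "<expan>con=<lb/>silio</expan>"
    "</choice></w>"
)

# <w> inside <choice>: a choice between two word tokens (option b).
W_IN_CHOICE = (
    "<choice>"
    "<abbr><w>co\u0304=<lb/>silio</w></abbr>"
    "<expan><w>con=<lb/>silio</w></expan>"
    "</choice>"
)


def count_w(xml_text):
    """Count every <w> element, regardless of which reading it sits in."""
    root = ET.fromstring(xml_text)
    # iter() includes the root element itself when its tag matches.
    return len(list(root.iter("w")))


def expanded_reading(xml_text):
    """Return the text of the <expan> reading without <lb/> or the '=' separator."""
    root = ET.fromstring(xml_text)
    expan = root.find(".//expan")
    # itertext() walks text and tails in document order; <lb/> contributes nothing.
    return "".join(expan.itertext()).replace("=", "")


print(count_w(CHOICE_IN_W))           # 1
print(count_w(W_IN_CHOICE))           # 2
print(expanded_reading(CHOICE_IN_W))  # consilio
print(expanded_reading(W_IN_CHOICE))  # consilio
```

Both nestings yield the same expanded text, but with <w> inside <choice> a
plain count of <w> elements sees two tokens for one word on the page, so
the counting or pointing logic has to pick a reading first.
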
On Jun 11, 2015 5:50 PM, "Piotr Bański" <[log in to unmask]> wrote:

> Dear Torsten,
>
> Like I said in an earlier message, to me, the choice between (b) and (c)
> depends on some project-internal assumptions -- if for some reason it would
> be easier for your processing or visualisation tools to be presented with a
> continuous stream of <w>s and, at the same time, the usage of <choice>
> would be heavily restricted, then well, one could consider (c). Choice (b)
> requires adjustments to how you count <w>s or how you point at them for
> visualisation (and there are a few more hidden assumptions here).
>
> As far as semantic equivalence between <w> and <choice> is concerned, then
> from the "lexical" point of view, there is none. From the "constructional"
> point of view, in those cases that you have selected, there might be
> equivalence on the bottom-up route, but from the top-down perspective (and
> as usual depending on some other assumptions), you might need an extra test
> for each occurrence of <choice> to see if it's really meant to stand for
> <w>, which altogether might negatively influence the processing time and
> complexity.
>
> Yet another bunch of assumptions concerns the way in which the markup is
> to be constructed, and by whom. It may sometimes be easier to have encoding
> guidelines in which every instance of a word is to be clad in <w> tags, and
> <choice> is used for (... | ...) at whichever level.
>
> HTH and best regards,
>
>   Piotr
>
>
> On 11/06/15 16:45, Torsten Schassan wrote:
>
>> Dear Magdalena, dear Piotr,
>>
>> thanks for your answers and your thoughts.
>>
>> The reason why I put "(sometimes)" in the subject line was exactly that
>> in those cases where <choice> contains just that single "word" I do see
>> the semantic equivalence. Nonetheless, I think that <choice> doesn't
>> target the sub-word level but rather words or multiple words. Thus,
>> I'm surprised that both of you would consider going for (c); I thought
>> (b) to be more natural.
>>
>> To answer your other questions:
>>
>>> - Will your <w>s carry IDs? And if so, for what purpose(s)?
>>
>> <w> might carry IDs, but its main purpose is to deal with the <lb/>
>> according to the needs of the reader: either to show line breaks and the
>> separator or to suppress both. Another typical application in our
>> digital library is to attach to it the coordinates of the word on the
>> page in order to highlight a search result.
>>
>>> - Do you envision using <choice> for anything else than hyphenated words?
>>
>> Yes, we use it for all possible editorial pairs (abbr+expan, sic+corr,
>> orig+reg); it is a relatively rare case that one of these comes with an
>> additional line break. Thus we've got "a stack" of incidents that the
>> XSLT has to take care of during publication.
>>
>> Best, Torsten
>>
>>
>> Am 11.06.2015 um 16:13 schrieb Magdalena Turska:
>>
>>> Dear Torsten,
>>>
>>> I think the third option is probably the most typical situation - the
>>> abbreviation is at the word level so you can wrap the <choice> in a <w>.
>>>
>>> As to why you want to mark up words, that's a whole different story. You
>>> said "in our editions we usually wrap words (tokens) that go across
>>> lines in <w>, e.g. <w>con=<lb/>silio</w>". Are they the only words you
>>> mark with <w>? If so, why do they deserve this special treatment? I
>>> think only answering these questions would allow one to judge one way
>>> of encoding "better" than the other.
>>>
>>> Magdalena
>>>
>>> On 11 June 2015 at 14:28, Torsten Schassan <[log in to unmask]> wrote:
>>>
>>>> Dear all,
>>>>
>>>> in our editions we usually wrap words (tokens) that go across lines in
>>>> <w>, e.g. <w>con=<lb/>silio</w>.
>>>>
>>>> Now, that word is abbreviated and that fact would be represented using
>>>> choice/abbr+expan.
>>>>
>>>> Would you say <choice> works on the same level as <w> thus only one of
>>>> them is needed, or not? Indeed, <w> is part of model.segLike while
>>>> <choice> can contain larger portions of text thus belonging to
>>>> model.linePart and model.pPart.editorial.
>>>>
>>>> Which encoding option would you consider to be best?
>>>>
>>>> a: mutual exclusiveness
>>>> either just <w>con=<lb/>silio</w>
>>>> or
>>>> <choice>
>>>>    <abbr>co&#x0304;=<lb/>silio</abbr>
>>>>    <expan>con=<lb/>silio</expan>
>>>> </choice>
>>>>
>>>> b: <w> inside
>>>> <choice>
>>>>    <abbr><w>co&#x0304;=<lb/>silio</w></abbr>
>>>>    <expan><w>con=<lb/>silio</w></expan>
>>>> </choice>
>>>>
>>>> c: <w> outside
>>>> <w>
>>>>    <choice>
>>>>      <abbr>co&#x0304;=<lb/>silio</abbr>
>>>>      <expan>con=<lb/>silio</expan>
>>>>    </choice>
>>>> </w>
>>>>
>>>>
>>>> Curious, best, Torsten
>>>>
>>>> --
>>>> Torsten Schassan
>>>> Digitale Editionen
>>>> Abteilung Handschriften und Sondersammlungen
>>>> Herzog August Bibliothek, Postfach 1364, D-38299 Wolfenbuettel
>>>> Tel.: +49-5331-808-130 (Fax -165), schassan {at} hab.de
>>>>
>>>> Handschriftendatenbank: http://diglib.hab.de/?db=mss
>>>>
>>>>
>>>
>>
>>