> I found lots of 'three dots' encoding
> German prose. They occur at the beginning of
> sentences, within sentences and in final position.
> I couldn't find them in the punctuation chapter
> of TEI P3.
> Suggestion: entity in the punctuation chapter with
> special regard to sentence initial, mid sentence and
> sentence final position ?
Firstly, do you need to do this at all? If you have a browser which can
condition searches on SGML structure, and if you mark up sentences, you
probably don't need a distinct entity for each of the three cases: the browser
should be able to distinguish them:
beginning: sentence-start followed by ellipses
middle: (anything but sentence-start) followed by ellipses followed by
(anything but sentence-end)
end: (ellipses followed by sentence-end) or
(ellipses followed by (sentence-end punctuation) followed by sentence-end)
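The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each sentence has already been isolated as a plain string (standing in for sentence mark-up), an ellipsis is either three full stops or the single character "…", and the set of punctuation allowed after a sentence-final ellipsis is my own guess. The names classify_ellipses and TRAILING_PUNCT are made up for the example.

```python
import re

# An ellipsis is assumed to be three full stops or the single character "…".
ELLIPSIS = re.compile(r"\.\.\.|…")

# Punctuation assumed to be allowed between a final ellipsis and sentence end.
TRAILING_PUNCT = ".!?\"')»”"


def classify_ellipses(sentence):
    """Return 'start', 'middle' or 'end' for each ellipsis in one sentence."""
    text = sentence.strip()
    positions = []
    for match in ELLIPSIS.finditer(text):
        # Anything other than whitespace before the ellipsis?
        before = text[:match.start()].strip()
        # Anything other than whitespace and closing punctuation after it?
        after = text[match.end():].strip(" " + TRAILING_PUNCT)
        if not before:
            positions.append("start")
        elif not after:
            positions.append("end")
        else:
            positions.append("middle")
    return positions


if __name__ == "__main__":
    for s in ["… und dann ging er.", "Er sagte … nichts.", "Er ging …."]:
        print(s, classify_ellipses(s))
```

A browser that can condition searches on SGML structure would do the same job against the sentence tags directly, with no string handling of this kind.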
If you do decide to distinguish the three cases explicitly in the
electronic text, by all means go ahead and define such entities, but give
careful consideration to their expansion. A first cut at a TEI-ish way to
handle this situation might be
<gap desc='…' extent='sentence start ellipses' resp=transcriber>
<gap desc='…' extent='sentence middle ellipses' resp=transcriber>
<gap desc='…' extent='sentence end ellipses' resp=transcriber>
as the expansions for the three entities, although purists can easily pick
holes in this idea:
-- it hides the ellipses from the copy text as attribute values, obscuring
the fact that they appear in the copy text, and making it difficult for many
browsers to display them
-- desc is arguably misused in being used as a container for the ellipses
-- extent should be used to say how much material was omitted, not where
it has been omitted
A solution might be to use <del>:
<del type='sentence start'>…</del>
<del type='sentence middle'>…</del>
<del type='sentence end'>…</del>
but then you begin to wonder about the value of the resp attribute: who is
responsible for this deletion? I'd say the creator of the copy text.
Probably. But others' ideas may differ. What's more, if different hands can
be identified as being responsible for different ellipses, it's no longer
possible to use entities: you've got to insert the mark-up in full. And, if
you wanted to add a commentary on the omission, you'd need a tag with an ID to
tie it to. Again, you can't do this with an entity.
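To see why an entity cannot carry an ID or a per-instance resp value, here is a rough Python imitation of entity expansion as plain text substitution. The entity names (ellstart, ellmid, ellend) are hypothetical, and the expansions are the <del> variants from above; a real SGML parser does much more than this, but the replacement text is equally fixed.

```python
import re

# Hypothetical entity names; expansions follow the <del> variant above.
ENTITIES = {
    "ellstart": "<del type='sentence start'>…</del>",
    "ellmid": "<del type='sentence middle'>…</del>",
    "ellend": "<del type='sentence end'>…</del>",
}


def expand(text):
    """Replace each known &name; reference with its fixed expansion."""
    def lookup(match):
        # Unknown references are left untouched.
        return ENTITIES.get(match.group(1), match.group(0))
    return re.sub(r"&(\w+);", lookup, text)


if __name__ == "__main__":
    print(expand("Er sagte &ellmid; nichts und dann &ellmid; doch etwas."))
```

Both references to the same entity come out as identical mark-up, so there is nothing a commentary could point at and no place to record a different hand for one of them.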
I'd say it's not the job of the TEI to define the three entities, firstly
because their expansion is dependent upon the aims of the transcriber, and
secondly because to define them might encourage their use in situations where
full-blown mark-up is really what is required. But particular projects should
not shrink from defining such entities if entities can fill their needs.
Does anybody else feel like setting more angels dancing on this particular pin?