Bruno>While I was tagging transcriptions of spoken texts (radio journals) I
Bruno>encountered a phenomenon that, at first sight, does not seem to be
Bruno>covered by the TEI-guidelines.
 
Michael>How about <gap reason='inaudible'> ?
 
There's also
 
<unclear>
 
DESCRIPTION: contains a word, phrase, or passage which cannot be
transcribed with certainty because it is illegible or inaudible in the
source.
 
Arguably, <unclear> is syntactic sugar for a kind of <gap>, anyway.  But
see the Guidelines, section 18.2.4, and make your own choice.
 
Bruno>In the middle of the text you have three dots (...). These do not
Bruno>indicate pause or hesitation, but they show that the person who
Bruno>transcribed the material couldn't understand what was said.
 
Nick>I would use the entity &hellip; from the ISOpub entity set (also referred
Nick>to as "ISO 8879-1986//ENTITIES Publishing//EN") which represents the
Nick>character (three dots, one character, known as a 'horizontal ellipsis') you
Nick>are describing.  This character is commonly used to indicate an omission
Nick>that exists in text for whatever reason.
 
I can't go along with this.  The transcriber's conventions apparently
define ellipses as a form of mark-up.  In transducing the texts to
TEI-conformant mark-up, one should transduce those ellipses into the
prescribed TEI form -- <gap> or <unclear> or whatever.
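
For what it's worth, here is a minimal sketch of such a transduction in
Python.  It rests on assumptions of mine rather than anything Bruno has
told us: that the transcriber always wrote either three plain dots or the
single horizontal-ellipsis character, that every such mark means "could
not understand" (as Bruno says is the case for this material), and that
<gap reason='inaudible'> is the form you settle on.  The function name
ellipses_to_gap is made up for the illustration.

```python
import re
from xml.sax.saxutils import escape

# Assumption: the transcriber marked unintelligible speech with exactly
# three ASCII dots or with the single horizontal-ellipsis character.
ELLIPSIS = re.compile(r"\.{3}|\u2026")

# The tag as quoted earlier in this thread (SGML style, no end-tag).
# An XML-based TEI document would write <gap reason="inaudible"/> instead.
GAP = "<gap reason='inaudible'>"


def ellipses_to_gap(text, gap=GAP):
    """Replace each transcriber's ellipsis in plain text with a gap tag.

    The text between ellipses is escaped so that a stray '&' or '<' in
    the transcription cannot be mistaken for mark-up.
    """
    pieces = ELLIPSIS.split(text)
    return gap.join(escape(piece) for piece in pieces)


if __name__ == "__main__":
    print(ellipses_to_gap("the minister said ... yesterday"))
```

If the same material also used dots for pauses or hesitations, a blind
substitution like this would be wrong, and each ellipsis would have to be
checked by hand before tagging.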
 
---
Dominic Dunlop