I am new to TEI markup but have recently been charged with encoding a
set of Oral Histories in P5.
Unfortunately, we are not encoding the Oral Histories directly from
the original recordings. Instead, we will be encoding them from
transcriptions that were made some time ago in MS Word. These
transcriptions are somewhat "cleaned-up" versions of the audio on the
tapes themselves; for example, paragraph breaks have been applied to
any commentary of substantial length.
Of course, we plan to include information about the documents involved
in our process in the TEI header, but I still have a few questions:
1) Has anyone else encountered a project like this, and if so, what
level of encoding did you attempt to provide?
2) Should we decide to retain the paragraph separations for the sake of
readability, how would that information best be encoded? Would it be
advisable to break up the longer commentaries into multiple utterance
(<u>) elements, to put a <pause/> element at each paragraph break, or to
do something else?
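To make the two options in question 2 concrete, here is a rough sketch of what I have in mind. The speaker identifier "#interviewee" and the placeholder text are made up for illustration and are not taken from our transcripts.

```xml
<!-- Option A: one <u> per transcript paragraph, same speaker repeated -->
<u who="#interviewee">First paragraph of a long answer.</u>
<u who="#interviewee">Second paragraph of the same answer.</u>

<!-- Option B: a single <u>, with <pause/> standing in for the paragraph break -->
<u who="#interviewee">First paragraph of a long answer.
  <pause/>
  Second paragraph of the same answer.</u>
```

My hesitation with option B is that a paragraph break in the Word file may not correspond to an actual pause on the tape, so the markup might claim more than we know.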