Dear TEI-L Readers,
While encoding fine-grained annotated transcriptions, I stumbled over
the TEI use of the <unclear> element.
As you know, this element is declared in a way that makes it usable
for encoding either written or spoken material. This generality
seems to make some (in my opinion) necessary nestings of spoken elements
impossible.
a) General and unproblematic use of <unclear> in spoken material:
<u who=Cwn>
<unclear>Excuse me&lf;</unclear><pause>
You dont have some aesthetic&trunc;
<pause><unclear>specially on early</unclear>
aesthetics terminology &lr;</u>
b) Encoding scheme for <u> in our transcriptions (simplified):
<u who=AA><w>You</> <w>don't</> <pause> <w>have</w> <vocal>
<w>some</> ... </u>
c) Desired encoding, now inserting <unclear>:
<u who=AA><w>You</> <unclear><w>don't</> <pause> <w>have</> <vocal>
<w>some</> </unclear>... </u>
The problem that occurs when using <unclear>...</> is that the TEI DTD
does not allow specific spoken elements to be nested inside it, so
<pause> and <vocal> are not allowed there.
A very cumbersome way around this problem would be to interrupt
<unclear> sequences where these elements appear:
c2)
<u who=AA><w>You</> <unclear><w>don't</> </unclear><pause> <unclear>
<w>have</> </unclear>
<vocal> <unclear><w>some</> </unclear>... </u>
Apart from the fact that this is very hard to read and cumbersome
for database retrieval, it also makes (in our case) the program that
automatically generates the SGML more difficult to write.
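To illustrate the extra logic the workaround in c2) demands from a generator, here is a minimal Python sketch. It is not our actual program: the token model, the function name render and the BLOCKED set are assumptions made only for this example, and it writes full </w> and </unclear> end tags rather than the short </> form.

```python
# Hypothetical token model: ("w", text) for a word, or (name, None) for an
# empty spoken element such as <pause> or <vocal>.
# Assumption for this sketch: these elements may not appear inside <unclear>.
BLOCKED = {"pause", "vocal"}


def render(tokens, start, end):
    """Render tokens as SGML, marking tokens[start:end] as unclear.

    <unclear> is closed before every blocked element and reopened
    afterwards, which is the interruption shown in c2).
    """
    out = []
    open_unclear = False
    for i, (kind, text) in enumerate(tokens):
        inside = start <= i < end
        wants_unclear = inside and kind not in BLOCKED
        if wants_unclear and not open_unclear:
            out.append("<unclear>")
            open_unclear = True
        elif not wants_unclear and open_unclear:
            out.append("</unclear>")
            open_unclear = False
        # Empty elements have no content and no end tag in this sketch.
        out.append(f"<{kind}>" if text is None else f"<{kind}>{text}</{kind}>")
    if open_unclear:
        out.append("</unclear>")
    return " ".join(out)


tokens = [("w", "You"), ("w", "don't"), ("pause", None),
          ("w", "have"), ("vocal", None), ("w", "some")]
print(render(tokens, 1, 6))
```

If nesting were allowed as in c), the generator would only have to emit one start tag at position start and one end tag at position end, with no bookkeeping in between.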
Does anyone have a suggestion for a better encoding of this problem? Is
it the intention of the TEI to avoid this kind of nesting?
Franck Bodmer
--------------------------------------------------
Department for Linguistic Data Processing
Institut fuer deutsche Sprache
Postfach 10 16 21
D-68016 Mannheim (Germany)
tel.: 49/+621-1581-271
fax : 49/+621-1581-200
http://www.ids-mannheim.de
-------------------------------------------------