As we revise our local customisation of P5 to conform to the current
release of the Guidelines, unresolved questions keep popping up.
I hope some of them are of interest to some of you.
Now to the issue: some parts of the transcribed audio files are too noisy
to be transcribed properly. We currently use <gap> for these passages,
since 'inaudible' is given as a sample value for <gap>'s @reason, which
suggests that it is an appropriate choice. What we are not sure of is how
to indicate the length of the missing stretch. We are considering either
abusing @extent with a time value, which would mean changing the datatype
of @extent to something one would usually find in a @dur attribute, or
adding <gap> to 'att.duration' or 'att.duration.w3c'. Again, the latter
might be something one would expect anyway when using the spoken module.
A brief illustration:
<gap reason="inaudible" extent="PT01M23S" unit="duration"/>
<gap reason="inaudible" dur="PT01M23S"/>
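For the variant with @dur, adding <gap> to 'att.duration.w3c' could be
sketched in an ODD roughly as follows. This is an untested sketch, not a
definitive customisation: it assumes that <gap> is declared in the core
module, that the fragment sits inside an existing <schemaSpec>, and that
class membership can be extended with mode="add" on <memberOf>.

```xml
<!-- Hypothetical ODD fragment: make <gap> a member of att.duration.w3c
     so that it can carry @dur. Assumes an enclosing <schemaSpec>. -->
<elementSpec ident="gap" module="core" mode="change">
  <classes mode="change">
    <!-- assumption: mode="add" keeps the existing memberships of <gap> -->
    <memberOf key="att.duration.w3c" mode="add"/>
  </classes>
</elementSpec>
```

With such a change, @extent would keep its current datatype, and only
the @dur form of the illustration above would be needed.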
What do you think?
| Stefan Majewski | Department of English, University of Vienna |
| VOICE Corpus | Spitalgasse 2-4, Universitätscampus AAKH, Hof 8 |
| | A-1090 Vienna |
| Research Ass.(IT)| Phone: +43 1 4277 424 46 |