I am currently building a corpus of speech involving simultaneous or liaison
interpreting between English and Italian.
I have had some problems encoding truncations and the lengthening of sounds. As
far as I know, using the Guidelines I can encode them in two different ways:
- I could use entity references (&trunc; - &long;)
- I could use the DEL element for word truncations and the REG element for
the lengthening of sounds. This would have the advantage of making the
standard forms recoverable (where they can be guessed).
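
To make the two options concrete, here is a rough sketch of how I imagine
each would look for a truncated "interp" restarted as "interpreter" and a
lengthened "so". The utterance itself is invented, and the entity names, the
who value, the type value and the orig attribute are my own assumptions
that I have not checked against the DTD.

```xml
<!-- Option 1: entity references (trunc and long would have to be
     declared in the DTD subset; the names are hypothetical) -->
<u who="INT">the interp&trunc; the interpreter said so&long; it was fine</u>

<!-- Option 2: DEL marks the truncated fragment, REG carries the guessed
     standard form and keeps the transcribed form in orig (assumed usage) -->
<u who="INT">the <del type="truncation">interp</del> the interpreter said
  <reg orig="sooo">so</reg> it was fine</u>
```

With the second option a stylesheet could presumably output either the
transcribed form or the regularized one, which is what I mean by
recoverable.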
Do you know of any other way of doing it? Or do you know of any other work I
could refer to that dealt with the same problem?
In addition, do you know of any work I could refer to that used an XSL
stylesheet to align turn overlaps for display?
Thank you very much.
SSLMIT - University of Bologna
C.so della Repubblica, 156