Your mail was not delivered as follows:
%MAIL-E-SENDERR, error sending to user ELS009
-MAIL-E-OPENOUT, error opening DISK$EL:[ELS009]MAIL.MAI; as output
-SYSTEM-F-IVDEVNAM, invalid device name
 
Your original mail header and message follow.
 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Via: UK.AC.EARN-RELAY; Mon, 25 Nov 91  19:29 GMT
Received: from UKACRL by UK.AC.RL.IB (Mailer R2.07) with BSMTP id 5058; Mon, 25
          Nov 91 18:16:42 GMT
Received: from UICVM by UKACRL.BITNET (Mailer R2.07) with BSMTP id 0462; Mon,
          25 Nov 91 18:16:37 GM
Received: by UICVM (Mailer R2.07) id 0706; Mon, 25 Nov 91 12:14:06 CST
Date:     Mon, 25 Nov 1991 09:28:23 MST
Reply-To: Terry Langendoen <[log in to unmask]>
Sender:   "TEI-L: Text Encoding Initiative public discussion list" <[log in to unmask]
          UICVM>
From:     Terry Langendoen <[log in to unmask]>
Subject:  Alignment mechanisms
X-cc:     Wendy Plotkin <[log in to unmask]>
To:       "Thomas N. Corns" <[log in to unmask]>
 
<tei.1 id=am1>
<tei.header>
<file.description>
<title.statement>
<title>Alignment mechanisms
<author>D. Terence Langendoen
<!--The following six lines are not TEI-conformant!-->
<address>
<address.line>Department of Linguistics
<address.line>University of Arizona
<address.line>Tucson, AZ 85721 USA
<address.line>[log in to unmask]
</address>
</file.description>
<revision.history>
<change.note>
<who>DTL
<date>24 November 1991</date>
<what>first draft
</change.note>
<change.note>
<who>DTL
<date>25 November 1991</date>
<what>update, incorporating reference to <citn>Norway meeting
report</citn>.
</change.note>
</revision.history>
</tei.header>
<text>
<front>
<front.part>
<head>Request for comments
<p>Comments are welcome, but please send them promptly: to the sender
and, if you think them important enough, also to the file server or
the TEI editors.
<body>
<head>Alignment mechanisms
<div1 n=1>
<head>Background
<div2 n=1>
<head>Explicit alignment in <citn>TEI P1</citn>
<p id=eap1>The mechanism for explicit alignment of different parts of
text with one another is described in detail in <citn id=c1>TEI P1,
section 6.2.5, pp. 142-144</citn>.  It consists of an
<tag>alignment</tag>, made up of pointers (in the form of
<tag>al.ptr</tag> and other elements), which point from the
<tag>alignment</tag> into the <tag>text</tag>.  However, it is possible
also to have pointers from the <tag>text</tag> into the
<tag>alignment</tag>, if <tag>al.ptr</tag> and the other elements in the
<tag>alignment</tag> are permitted to have their own <term>id</term>s.
This mechanism can be used to align not only the different linguistic
analyses of text portions (the only application described in that<xref
target=c1> section), but also parallel texts.
<div2 n=2>
<head>Proposal in <citn>TEI AI2 W1</citn> for <tag>timeline</tag>
<p>The Spoken Text Workgroup chaired by <name>Stig Johansson</name>
proposed that encodings of spoken texts be accompanied by a
<tag>timeline</tag>, consisting of pointers from the <tag>text</tag>
into the <tag>timeline</tag>.  One purpose for the <tag>timeline</tag>
is to align the elements of spoken text according to when they were
spoken, so that, for example, temporal overlapping of speakers can be
represented.
<div1 n=2>
<head>Discussion of alignment mechanisms at Myrdal meeting
<p>At the meeting of Working Group chairs and other TEI
representatives recently concluded at Myrdal, Norway, the similarity of
the <tag>alignment</tag> and <tag>timeline</tag> mechanisms was noted,
and I was charged with the task of reconciling the two mechanisms for
<citn>TEI P2</citn>.
<note place=foot>See <citn>Norway meeting report, posted on TEI-L,
24 November 1991</citn>, item 7.</note>
<trailer>
<p>In the next section<xref target=d3>, I outline my solution to
this problem.  As this<xref target=am1> is being written, my research
assistant <name>Steven Zepp</name> is writing the formal specifications
and validating them.  Once they are prepared and validated, they will be
circulated.
<div1 id=d3 n=3>
<head>Outline of proposal for <citn>TEI P2</citn>
<div2 n=1>
<head>Justification for distinct tagsets
<p>The two mechanisms are sufficiently different that they require
different tagsets.  An <tag>alignment</tag> as described in <citn>TEI
P1</citn> consists of one or more <tag>al.map</tag>s, which in turn must
consist of at least two <tag>al.ptr</tag>s or other pointing elements
(<tag>al.list</tag> and <tag>al.range</tag>).  The two or more pointers
that are grouped together within an <tag>al.map</tag> may be said to
<emph>correspond</emph> to one another.  On the other hand, a
<tag>timeline</tag> contains a set of <tag>point</tag>s.  As described
in <citn>TEI AI2 W1</citn>, these are unstructured, but each can be
thought of as consisting of zero or more pointing elements, which are
aligned with that point (and perhaps understood as synchronous with
it).  The <tag>timeline</tag> moreover
requires attributes which the <tag>alignment</tag> does not; similarly
the <tag>point</tag>s require attributes which the <tag>al.map</tag>s do
not.
<div2 n=2>
<head>Restructuring of the tagset for analytic and textual
correspondence
<p>First, I suggest that certain tags in the tagset for textual and
analytic correspondence (what <citn>TEI P1</citn> called
<cited.word>explicit alignment</cited.word>) be renamed as follows.
<list type=ordered>
<enum>1. <item><tag>alignment</tag> becomes <tag>corresp.grp</tag>;
<enum>2. <item><tag>al.map</tag> becomes <tag>corresp</tag>;
<enum>3. <item><tag>al.ptr</tag> becomes <tag>xref</tag>;
<note place=foot>
<p>On the use of <tag>xref</tag> as a general purpose pointer, see
<citn>Norway meeting report, posted on TEI-L, 24 November 1991</citn>.
<p>I assume here that <tag>xref</tag> is characterized as in
<citn>TEI P1</citn>.  However, its definition may be expected to change
in ways which are not material here.</note>
<enum>4. <item><tag>al.list</tag> becomes <tag>xref.grp</tag>, and
should consist of two or more <tag>xref</tag>s, not one or more
<tag>al.ptr</tag>s, as in <citn>TEI P1</citn>.
</list>
The names <cited.word>corresp.grp</cited.word> and
<cited.word>corresp</cited.word> more accurately reflect the intended
semantics of their respective elements than do
<cited.word>alignment</cited.word> and <cited.word>al.map</cited.word>.
<p> Second, I propose to eliminate <tag>al.range</tag>, as its function
can be subsumed under <tag>xref</tag>, in virtue of the
<term>target.end</term> attribute on <tag>xref</tag>.
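<p>As a minimal sketch of the renamed tagset, a correspondence between
two short parallel texts might be encoded as follows.  The sample
sentences, the <tag>s</tag> elements and all <term>id</term> values are
hypothetical, chosen only for illustration, and it is assumed that
<term>target</term> and <term>target.end</term> together delimit a
range; the formal specification may differ.
<xmp><![ CDATA [
<text>
<div1 id=en>
<s id=en1>The cat sleeps.
<s id=en2>The dog barks.
<s id=en3>It is loud.
</div1>
<div1 id=fr>
<s id=fr1>Le chat dort.
<s id=fr2>Le chien aboie fort.
</div1>
</text>
<corresp.grp>
<corresp>
<xref target=en1>
<xref target=fr1>
</corresp>
<corresp>
<xref target=en2 target.end=en3>
<xref target=fr2>
</corresp>
</corresp.grp>
]]></xmp>
<p>In the second <tag>corresp</tag>, the single <tag>xref</tag> bearing
<term>target.end</term> covers the span that would formerly have
required an <tag>al.range</tag>.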
<div2 n=3>
<head>Restructuring of the tagset for synchronizing text
<p>First, I suggest that certain tags in the tagset proposed in
<citn>TEI AI2 W1, Spoken Texts</citn> for synchronizing text be
renamed as follows.
<list type=ordered>
<enum>1. <item><tag>timeline</tag> becomes <tag>align</tag>;
<enum>2. <item><tag>point</tag> becomes <tag>loc</tag>.
</list>
The reason for giving <tag>timeline</tag> a more neutral name is that it
can be used not only for alignment with time but also with any
one-dimensional structure associated with a text, such as lineation and
word position.  The reason for giving <tag>point</tag> a new name is to
dissociate it from the notion of a <term>pointer</term>; the name
<cited.word>loc</cited.word> indifferently represents
<gloss>locus</gloss> or <gloss>location</gloss>.
<p>Second, I suggest that <tag>align</tag> should have certain
attributes which specify whether it is a temporal or spatial (textual)
alignment; what the origin is, if any; whether the <tag>loc</tag>s are
understood to be a fixed distance apart; how the distance between
<tag>loc</tag>s is measured; whether the distance to a particular
<tag>loc</tag> is being measured from the origin or from the immediately
preceding <tag>loc</tag>; etc.  Similarly <tag>loc</tag>s should have
attributes which indicate their value in the dimension represented by
<tag>align</tag> (at minimum, the <tag>loc</tag> identified as the
origin should be so specified); their distance from the previous
<tag>loc</tag> or the origin; etc.
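<p>By way of illustration only, attributes of this kind might be
declared roughly as follows.  The names <term>origin</term>,
<term>interval</term>, <term>measured.from</term> and
<term>value</term> are those used in the temporal example later in this
section; the names <term>type</term> and <term>unit</term>, all
declared value types and all defaults are assumptions made for this
sketch and are not part of any agreed specification.
<xmp><![ CDATA [
<!-- Hypothetical attribute sketch; names "type" and "unit" are assumed -->
<!ATTLIST align
  type          (temporal | spatial)  #IMPLIED
  origin        IDREF                 #IMPLIED
  interval      NUMBER                #IMPLIED
  unit          CDATA                 #IMPLIED
  measured.from (origin | previous)   #IMPLIED>
<!ATTLIST loc
  id            ID                    #IMPLIED
  value         CDATA                 #IMPLIED>
]]></xmp>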
<p>Third, I suggest that the content model for an <tag>align</tag>
be one or more <tag>loc</tag>s, like that of a <tag>corresp.grp</tag>,
which consists of one or more <tag>corresp</tag>s.  However, a
<tag>loc</tag> should consist of zero or more <tag>xref</tag>s, in
contrast to a <tag>corresp</tag>, which consists of two or more
<tag>xref</tag>s or <tag>xref.grp</tag>s.  The content model for
<tag>loc</tag> need not include <tag>xref.grp</tag>s.
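<p>The content models just described can be sketched informally in SGML
element declarations as follows.  The tag-minimization indicators are
an assumption, attribute declarations are omitted, and <tag>xref</tag>
is taken to be declared as in <citn>TEI P1</citn>; the formal
specifications now being prepared take precedence over this sketch.
<xmp><![ CDATA [
<!-- Tagset for analytic and textual correspondence -->
<!ELEMENT corresp.grp - - (corresp+)>
<!ELEMENT corresp     - - ((xref | xref.grp), (xref | xref.grp)+)>
<!ELEMENT xref.grp    - - (xref, xref+)>
<!-- Tagset for synchronizing text -->
<!ELEMENT align       - - (loc+)>
<!ELEMENT loc         - - (xref*)>
]]></xmp>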
<p>Finally, note that both <tag>corresp.grp</tag> and <tag>align</tag>
permit pointing from the text to the map and from the map to the text.
This bidirectionality is shown in the following<xref
target=d3.3.1> illustration of the use of <tag>align</tag>.
<div3 id=d3.3.1 n=1>
<head>Example of the use of a temporal <tag>align</tag>
<p>The following<xref target=ex1> example is adapted from <citn>TEI AI2
W1, section 8.5, Speaker overlap</citn>.
<xmp id=ex1><![ CDATA [
<text>
<u who=A><xref id=x1 target=p1>this <xref id=x2 target=p2>is <xref
id=x3 target=p3>my <xref id=x4 target=p4>turn<xref id=x5 target=p5></u>
<u who=B><xref id=x6 target=p2>balderdash<xref id=x7 target=p4></u>
<u who=C><xref id=x8 target=p3>no <xref id=x9 target=p4>it's mine<xref
id=x10 target=p5></u>
<kinesic who=B id=k1 start=p4 end=p5 desc="waves arms">
</text>
<align origin=p1 interval=1 measured.from=previous>
<loc id=p1 value=0>
<xref target=x1>
</loc>
<loc id=p2>
<xref target=x2>
<xref target=x6>
</loc>
<loc id=p3>
<xref target=x3>
<xref target=x8>
</loc>
<loc id=p4>
<xref target=x4>
<xref target=x7>
<xref target=x9>
<xref target=k1>
</loc>
<loc id=p5>
<xref target=x5>
<xref target=x10>
<xref target=k1>
</loc>
</align>
]]></xmp>
<div1 n=4>
<head>Possible extensions of <tag>align</tag> to multidimensional
alignment
<p>Each <tag>align</tag> represents a one-dimensional structure (i.e.,
sequence) of <tag>loc</tag>s.  It is easy to see how this concept can be
extended to two-, three- and even higher-dimensional structures, to
represent, for example, the alignment of text on a page (two-dimensional
structure) or a book (three-dimensional structure in which the third
dimension is the page number).
</body>
</text>
</tei.1>
 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
End of returned mail