
WORKSHOP ANNOUNCEMENT AND CALL FOR PAPERS
 
 
LINGUISTIC COREFERENCE WORKSHOP
26 May 1998, Morning Session
 
Held in conjunction with
The First International Conference on Language Resources and Evaluation
Granada, Spain  (28-30 May 1998)
 
 
WORKSHOP AIMS
 
It is essential for a natural language processing system to
instantiate each object, process, attribute, and property correctly, so
that all references to the same item are recognized as such and the
inventory of all distinct items is accurate at all times. This problem
is far from being resolved. There are both linguistic and computational
reasons for this deficiency. First, there is no satisfactory microtheory
of linguistic coreference. Secondly and consequently, there is no
satisfactory application of such a microtheory to NLP.
 
A microtheory of coreference in natural language includes in its scope
all the phenomena that satisfy the following condition: an
object/entity, an event, an attribute, a property or its value, an
attitude, or any combination of the above is referred to more than once
in a natural-language text, and the understanding of the text depends on
the correct interpretation of the two or more referring expressions as
designating the same object, event, etc.  A linguistic microtheory of
coreference for a language consists of the following elements:
     - a complete range of covered phenomena in the language;
     - a taxonomy of the range;
     - a typology of the range;
     - a list of rules forming the various types of coreference;
     - a list of rules interpreting the various types of coreference.
 
There has been a considerable amount of work on a few selected types of
coreference, focusing almost exclusively on object coreference. Thus,
significant work has been done in theoretical linguistics on anaphora
and cataphora, subsuming, for the most part, earlier work on deixis. A
small minority of authors have tried to extend their studies of anaphora
beyond mere syntax. In the cognitive-linguistics and
philosophy-of-language traditions, interesting work has been done
relating anaphora and deixis to ambiguity resolution and discourse
structure. At the same time, an effort in comparative-contrastive
linguistics has led some writers to examine the data of more than one
language at a time, still emphasizing entity or object reference.
 
In computational linguistics, the problem of coreference took the form
of pronoun-antecedent resolution early on, and this particular task,
somewhat broadened to include a few other types of anaphora, still
remains at the center of the problem. The most sustained effort in the
computational treatment of
coreference has been mounted within the Tipster/MUC-6 initiative. While
it has been recognized since quite early in the game that coreference
resolution is based in large part on world knowledge, most of the work
done on the matter computationally and theoretically ignores and avoids
world knowledge. The MUC-6 initiative makes such an orientation quite
explicit: the work should be based on such simpler resources as
part-of-speech tagging, simple noun phrase recognition, basic semantic
category information like gender, number, and [to a limited extent]
full parse trees. Such an approach--trying to explore and maximize
everything that can be done simply and cheaply towards the resolution of
a complex problem--is perfectly legitimate as long as it is realized
that a considerable part of the problem remains unsolved, and this is
indeed fully realized within the MUC-6 initiative.
 
One persistent problem throughout the existing computational ventures
into coreference has been the lack of a consistent theoretical approach
to it. The result is that coreference phenomena are treated as
self-obvious, and most of them are overlooked, especially if they are
not explicit pronoun-antecedent or other equally evident anaphora cases.
What is needed for a full, accurate, and reliable approach to
coreference can be summarized, somewhat schematically, as involving the
following steps:
 
     1. understanding fully the range of the phenomenon and
     of the rules that govern it (theory);
     2. determining the extent of machine-tractable information
     in the rules;
     3. taking stock of all the rules that can be computed;
     4. developing the appropriate heuristics for the computable rules;
     5. computing the rules.
 
 
WORKSHOP AGENDA
 
The workshop will be held during the morning session of 26 May 1998 and
will include a joint address by the Organizing Committee (listed below),
followed by 5-8 individual presentations in two 90-120-minute blocks,
with a break provided midway through.
 
 
 
CALL FOR PAPERS
 
The Workshop solicits papers addressing one or more of the points
raised above, as well as any other pertinent issues.
 
Papers based on a diversity of languages are encouraged, both one
language at a time and, especially, comparative/contrastive studies.
Also strongly encouraged are papers which extend the study of
coreference beyond entity/object reference, across document boundaries,
and/or into non-text media.
 
 
 
FORMAT FOR SUBMISSION
 
Paper submissions should consist of an extended abstract of
approximately 800 words, along with a brief description of the proposed
presentation structure (e.g., paper, paper plus demo, etc.).
 
Each submission should include a separate title page, providing the
following information: the title to be printed in the Conference
program; names and affiliations of all authors; the full address of the
primary author (or alternate contact person), including phone, fax,
email; and required audio-visual equipment.
 
Papers may be submitted by sending three hardcopies or one softcopy (in
TeX, ASCII, or PostScript format) to the appropriate address as listed
below:
 
    Dr. Victor Raskin
    Chair, Interdepartmental Program in Linguistics
    Heavilon Hall
    Purdue University
    West Lafayette, IN   47907   USA
 
    [log in to unmask]
 
Submissions must be received no later than 1 March 1998 for a 15 March
notification of paper acceptance. (Full versions of all accepted papers are
requested no later than 15 April 1998 for inclusion in the conference
proceedings.)
 
 
WORKSHOP ORGANIZING COMMITTEE
 
Dr. Sara J. Shelton (Contact Person)
US Department of Defense
9800 Savage Road, R525
Ft Meade, MD  20755   USA
[log in to unmask]
301-688-0301 (voice)
301-688-0338 (fax)
 
Dr. Eduard Hovy
Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina Del Rey, CA  90292-669   USA
[log in to unmask]
310-822-1511, ext. 731 (voice)
 
Dr. Victor Raskin
Interdepartmental Program in Linguistics
Heavilon Hall
Purdue University
West Lafayette, IN   47907   USA
[log in to unmask]
765-494-3782 (voice)
765-494-3780 (fax)