 
                    *** PRIMARY CALL FOR PAPERS ***
 
ACL's SIGDAT and SIGNLL present the
 
   THIRD WORKSHOP ON VERY LARGE CORPORA
 
 
WHEN:    June 30, 1995 - immediately following ACL-95 (June 27-29)
WHERE:   MIT, Cambridge, Massachusetts, USA
 
 
WORKSHOP DESCRIPTION:
 
As in past years, the workshop will offer a general forum for new research in
corpus-based and statistical natural language processing.  Areas of interest
include (but are not limited to): sense disambiguation, part-of-speech tagging,
robust parsing, term and name identification, alignment of parallel text,
machine translation, lexicography, spelling correction, morphological analysis
and anaphora resolution.
 
This year, the workshop will be organized around the theme of:
 
        Supervised Training vs. Self-organizing Methods
 
Is annotation worth the effort?  Historically, annotated corpora have
made a significant contribution.  The tagged Brown Corpus, for example,
led to important improvements in part-of-speech tagging.  But annotated
corpora are expensive.  Very little annotated data is currently available,
especially for languages other than English.  Self-organizing methods offer
the hope that annotated corpora might not be necessary.  Do these methods
really work?  Do we have to choose between annotated corpora and
unannotated corpora?  Can we use both?
 
The workshop will encourage contributions of innovative research along this
spectrum.  In particular, it will seek work in languages and applications
where appropriately tagged training corpora do not currently exist.
It will also explore what new kinds of corpus annotations (such as discourse
structure, co-reference and sense tagging) would be useful to the community,
and will encourage papers on their development and use in experimental
projects.
 
The theme will provide an organizing structure to the workshop, and
offer a focus for debate.  However, we expect and will welcome a diverse
set of submissions in all areas of statistical and corpus-based NLP.
 
 
PROGRAM CHAIRS:
 
    Ken Church      - AT&T Bell Laboratories
    David Yarowsky  - University of Pennsylvania
 
 
SPONSORS:

    LEXIS-NEXIS, Division of Reed and Elsevier, Plc.
    SIGDAT (ACL's special interest group for linguistic data
            and corpus-based approaches to NLP)
    SIGNLL (ACL's special interest group for natural language learning)
 
 
FORMAT FOR SUBMISSION:   Authors should submit a full-length paper
(3500-8000 words), either electronically or in hard copy. Electronic
submissions should be mailed to "[log in to unmask]", and
must be either (a) plain ASCII text, (b) a single PostScript file, or
(c) a single LaTeX file following the ACL-95 stylesheet (no separate
figures or .bib files). Hard copy submissions should be mailed to
Ken Church (address below), and should include four (4) copies of
the paper.
 
REQUIREMENTS: Papers should describe original work. A paper accepted
for presentation must not have been presented, and may not be presented,
at any other meeting. Papers submitted to other conferences will be
considered, as long as this fact is clearly indicated in the submission.
 
 
SCHEDULE:
 
  Submission Deadline:    March 20, 1995
  Notification Date:      April 18, 1995
  Camera ready copy due:  May 11, 1995
 
CONTACT:
 
   Ken Church                           David Yarowsky
   Room 2B-421                          Dept. of Computer and Info. Science
   AT&T Bell Laboratories               University of Pennsylvania
   600 Mountain Ave.                    200 S. 33rd St.
   Murray Hill, NJ 07974  USA           Philadelphia, PA 19104-6389  USA
   e-mail: [log in to unmask]           e-mail: [log in to unmask]