CHRISTINE Corpus, Stage I
Stage I of the CHRISTINE Corpus is now available. It
comprises a structurally annotated cross-section of
spontaneous 1990s speech drawn from all UK regions,
social classes, etc. The annotation scheme is that of the
well-established SUSANNE Corpus, and is defined in detail in G.R. Sampson,
_English for the Computer_, Clarendon Press
(Oxford University Press), 1995.
CHRISTINE/I is described in detail in its Documentation
file, which is available on the Web at
(250 kb, about 35,000 words). Another Web page,
discusses the background and aims of the CHRISTINE Project.
The Corpus can be downloaded by anonymous FTP. The URL is
-- use "uncompress" to uncompress the file, and then
"tar -xf" to unpack the tar file into its 84 component files
(which include a copy of the Documentation file).
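
For readers who prefer to script the unpacking step, the following is a minimal sketch in Python rather than part of the official distribution. It assumes the downloaded file has already been uncompressed to a plain tar file; the function name unpack_and_list and its parameters are hypothetical, chosen only for illustration. It unpacks the archive and returns the names of the regular files it contains, so that the count can be compared with the 84 component files mentioned above.

```python
import tarfile
from pathlib import Path


def unpack_and_list(tar_path, dest_dir):
    """Unpack a plain (already uncompressed) tar file into dest_dir.

    Returns the names of the regular files stored in the archive.
    Hypothetical helper: not part of the CHRISTINE distribution.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # mode "r:" means "read an uncompressed tar file", which is the
    # state of the archive after running "uncompress" on the download.
    with tarfile.open(tar_path, mode="r:") as archive:
        names = [m.name for m in archive.getmembers() if m.isfile()]
        archive.extractall(dest)

    return names
```

Calling len() on the returned list gives a quick check that all component files arrived; the path passed as tar_path is whatever name the uncompressed archive has on the local machine.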
CHRISTINE/I includes about 40% of the eventual complete CHRISTINE
Corpus. The complete Corpus is expected to be ready for distribution
early in the year 2000.
Prof. Geoffrey Sampson
School of Cognitive & Computing Sciences
University of Sussex
Falmer, Brighton BN1 9QH, GB
tel. +44 1273 678525
fax +44 1273 671320
Web site http://www.grs.u-net.com