Sent: Thursday, April 20, 2000 7:32 PM
Subject: XML at OSU
>Announcing a 5-day workshop associated with
>"Spoken Language in Context:
>Methods and Models", July 3-7, 2000
>(see http://ling.ohio-state.edu/SU2000
>for further information)
>
>XML and Linguistic Annotation
>
>Chris Brew
>Department of Linguistics
>Ohio State University
>
>Corpora of spoken and written language are crucial to much of linguistics,
>providing both quantitative and qualitative data which informs and grounds
>our work. Much of the material which is available is raw text, but this is
>complemented by a substantial and increasing number of annotated corpora.
>It is important to ensure that such annotated corpora are reliable,
>re-usable and maximally informative, but it is not immediately obvious how
>this is to be achieved, not least because the corpus data often stimulates
>research which was not envisaged at the time that the data was collected.
>
>XML (the eXtensible Markup Language) provides a standardized vehicle for the
>generation, processing and exchange of arbitrary structured data,
>including, but not limited to, texts marked up with linguistic information.
>Many, but by no means all, corpus creation initiatives have chosen to adopt
>the XML route. This means that researchers who want to use (and perhaps
>add to) the products of these efforts need to understand something of what
>XML is and how it can be used. Non-linguistic applications of XML will be
>covered only tangentially.
>
>This workshop introduces XML as a means for creating and using linguistic
>annotations, gives hands-on experience of both corpus annotation and
>corpus use, and discusses XML's strengths and weaknesses as a research tool.
>There will be five 105-minute sessions, one per day, spread over a week,
>along with practical sessions covering the use of text and speech data.
>Students should expect to spend approximately 60 minutes per day on the
>practicals. The only prerequisite is a very basic training in any of the
>language sciences. The workshop should therefore be accessible to all
>participants in "Spoken Language in Context: Methods and Models".
>