Although this is not strictly TEI matter, when reading this query on the LN
list I thought there may be a number of people on TEI-L who may have some
My name is Alain Matthey and I am a computer scientist. I am
working as a research assistant in the Laboratory of Speech and
Language Processing of the University of Neuchatel (Switzerland).
The members of this laboratory are now working on a research
project which consists of developing a kind of spelling and grammar
checker like Grammatik, IBM's Critique, Mac Proof, Hugo or Sans
fautes, but for native French speakers who write in English.
In this project, I will have to implement the "preprocessing step", which
consists of recognising and delimiting the sentences and the words of a text.
How can sentences and words be found and delimited automatically in any kind
of ASCII text? That's the problem!!!
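
To make the difficulty concrete, here is a minimal sketch in Python of one naive approach. It is an illustration under stated assumptions and not the method used in the project: the abbreviation list and the function names split_sentences and split_words are hypothetical, and the rule "sentence-final punctuation followed by whitespace and a capital letter" is only a heuristic.

```python
import re

# Hypothetical, deliberately tiny abbreviation list (an assumption for this
# sketch); a usable list would be far larger and language-specific.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "e.g.", "i.e.", "etc."}


def split_sentences(text):
    """Split text at '.', '!' or '?' when followed by whitespace and an
    upper-case letter, unless the token ending there is a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]+(?=\s+[A-Z])", text):
        end = match.end()
        # The last whitespace-delimited token before the candidate boundary.
        last_token = text[start:end].split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue  # e.g. "Dr. Smith": not a sentence boundary
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


def split_words(sentence):
    """Return words (keeping internal apostrophes and hyphens) and
    punctuation marks as separate tokens."""
    return re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*|[^\sA-Za-z0-9]", sentence)


sample = "Dr. Smith arrived at 3.15 p.m. yesterday. He didn't stay! Why not?"
for sentence in split_sentences(sample):
    print(split_words(sentence))
```

Even this small sketch shows where the trouble lies: the full stop serves as a sentence terminator, an abbreviation marker and a decimal point, and an abbreviation that happens to end a sentence would be handled wrongly by the rule above.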
So I am looking for information (bibliography,
papers, etc.) about "preprocessing of ASCII texts".
For any further information or to send an answer, please contact me at
the address below:
Laboratoire de traitement du langage et de la parole
UNIVERSITE DE NEUCHATEL
Avenue du Premier-Mars 26
Phone: 038 25 38 51 (int. 27)
Fax: 038 25 18 32
E-mail: [log in to unmask]
Thank you very much for your help!
P.S. It is not forbidden to write in French!!!