LREC is a huge European conference for people engaged in Natural
Language Processing or Language Engineering, or whatever it's called to
distinguish it from computational linguistics on the one hand and real
linguistics on the other. It has met every two years since 1998, always
in a pleasant southern Mediterranean location, and each time with a
bigger and more confusing programme. Nearly a thousand people turned up
for at least part of this year's, the fourth, which was held in the
Centro Cultural de Belém just outside Lisbon, over four days, with
additional pre- and post-conference satellite sessions. Each day of the
conference had a couple of plenaries, up to five parallel sessions, and
a huge poster session, interrupted only by the spectacle of several
hundred people trying to decide which of the dozens of pastelarias and
restaurantes in Belém to patronise for a decent lunch. The location
was a splendid modern building of palatial proportions, full of
unexpected vistas over the river Tagus, and blessed by wireless
networking.

Some indication of the size of the conference is also given by the fact
that, although all contributions had been rigorously restricted to a
maximum of four A4 sides, the complete printed proceedings still
required six telephone-directory-sized volumes, weighing 8.3 kg (I
know this because I promptly entrusted mine to the Portuguese Postal
Service). Judging by the sessions I attended, a greater selectivity on
the part of the programme committee might have helped with this problem;
on the other hand, it would have reduced LREC's usefulness as an instant
barometer of what is actually going on in European NLP, and its immense
value as a talking shop and meeting point. University finances being
what they currently are, very few people would have the opportunity to
network in this way without the entrance ticket of a published
conference paper to justify their presence: the poster sessions are thus
the heart of this conference, as are the coffee breaks and the social
events.

In sessions I attended, I noticed that standoff markup and XML
annotation of various kinds (typically XCES or TEI) had established
themselves as the norm (well, I would, wouldn't I). Particularly notable
were a paper in which Nancy Ide was able to confirm at last that the
American National Corpus was now being distributed (I spent some time
talking with Keith Suderman, her programmer, about their use of W3C
Schema which confuses the current version of Xaira somewhat); a
presentation about a grand new historical Corpus of German being
proposed by a consortium of 15 universities, which, if the DFG funds it,
will eventually become the world's biggest and most complex TEI
resource; a poster about a huge corpus of Italian newspaper text marked
up using an uneasy hybrid of TEI and sui-generis lexical annotation;
several papers on ways of handling multiple hierarchies with standoff or
other tricks; not a few papers on representation of linguistic
annotation using feature structures or similar formalisms. Tomaž
Erjavec gave a workmanlike presentation of the results from the TEI
Migration Working Group; I presented the results from the ISO/TEI
activity on feature structures; both were politely received. Elsewhere,
there were several papers and meetings relating to the so-called lexical
annotation framework activity within ISO; lots of interesting work on
text summarization and speech recognition (including a rather scary
paper about a huge project being undertaken at UPenn -- guess what,
it's all about transcribing telephone conversation in Arabic); some
interesting new tools (I was particularly taken by EXMARaLDA, which
appears to do proper "partitur" style arrangement of speech transcripts).

LREC was in many ways the brainchild of the late Antonio Zampolli; it is
therefore appropriate that this year's opening plenary included a
special session in his memory. Three of Antonio's closest friends and
colleagues -- Bernard Quemada, Makoto Nagao, and Martin Kay -- gave
complementary tributes, each very different, and each quite moving in
its own way, though Quemada's was probably the finest. And the closing
plenary, given by Fred Jelinek, was an intriguing retrospective,
showing how far we have come since the days when what the NLP community
did was considered to be a strange branch of electrical engineering,
rather than central to linguistics.

The weather was good too....