CLARIN has an online tool called WebLicht[1] for linguistic annotation.
It uses pluggable modules to create annotation pipelines, e.g. for
lemmatisation, part-of-speech tagging, named entity recognition, and
syntactic parsing. It uses its own XML dialect, but it also has a TEI
import module.
Personally, I’d love to see more interesting analysis applications
beyond linguistic annotation as pluggable modules for this system. In
my project[2], we are developing a network analysis module on top of
it. Soon, I will be releasing a small helper library for Python that
simplifies the development of WebLicht modules. It will ship with a
sample module that exports into MALLET format for topic modelling,
splitting the text at a given level of the hierarchy. But I think much
more is possible with this approach.
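
To make the idea of "splitting the text at a given level of the
hierarchy" concrete, here is a minimal sketch. It is not the announced
helper library or its sample module: it works directly on TEI rather
than on WebLicht's XML, the function names and the "tei" label are
made up for illustration, and it assumes the hierarchy is expressed as
nested <div> elements inside <body>. Each division becomes one line in
MALLET's one-instance-per-line import format (name, label, text).

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def divs_at_level(root, level):
    """Return the <div> elements nested `level` deep below <body>.

    Level 1 means the top-level divisions, level 2 their children, etc.
    """
    body = root.find(f".//{TEI_NS}body")
    current = [body] if body is not None else []
    for _ in range(level):
        current = [
            child
            for parent in current
            for child in parent.findall(f"{TEI_NS}div")
        ]
    return current


def to_mallet_lines(tei_xml, level=1, label="tei"):
    """Turn each division at `level` into one 'name<TAB>label<TAB>text' line."""
    root = ET.fromstring(tei_xml)
    lines = []
    for i, div in enumerate(divs_at_level(root, level), start=1):
        # Joining text nodes with spaces keeps headings and paragraphs
        # apart, at the cost of splitting words that contain inline markup.
        text = " ".join(" ".join(div.itertext()).split())
        lines.append(f"div{i}\t{label}\t{text}")
    return lines
```

Writing the returned lines to a file, one per line, should give
something MALLET's import-file command can read; a real module would
also want stable division names (e.g. from xml:id) instead of a counter.
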
At the other end, I also think that the TEI import needs some
refinements. For our own project, we are using a custom XQuery to
create a WebLicht XML file from TEI, but it would be great to have a
generally available converter that supports a variety of TEI features.
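
For readers who have not seen the target format, the following is a
rough sketch of the simplest possible conversion, in Python rather than
the XQuery mentioned above. It only copies the plain text of the TEI
<body> into the text layer of a WebLicht (TCF) document and ignores all
other TEI features. The function name is hypothetical, and the
namespaces and the version attribute reflect my understanding of
TCF 0.4; they should be checked against the WebLicht documentation
before use.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
# Assumed TCF 0.4 namespaces; verify against the WebLicht documentation.
DSPIN = "http://www.dspin.de/data"
META = "http://www.dspin.de/data/metadata"
TC = "http://www.dspin.de/data/textcorpus"


def tei_to_tcf(tei_xml, lang="en"):
    """Wrap the plain text of a TEI <body> in a minimal TCF document."""
    tei_root = ET.fromstring(tei_xml)
    body = tei_root.find(f".//{TEI}body")
    # Collapse all whitespace so the text layer is one clean string.
    plain = " ".join(" ".join(body.itertext()).split()) if body is not None else ""

    root = ET.Element(f"{{{DSPIN}}}D-Spin", {"version": "0.4"})
    ET.SubElement(root, f"{{{META}}}MetaData")
    corpus = ET.SubElement(root, f"{{{TC}}}TextCorpus", {"lang": lang})
    ET.SubElement(corpus, f"{{{TC}}}text").text = plain
    return ET.tostring(root, encoding="unicode")
```

A converter that is actually useful would have to decide what to do
with notes, apparatus, page breaks and so on, which is exactly where
the refinements would need to go.
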
Frederik
[1]: http://weblicht.sfs.uni-tuebingen.de/weblichtwiki/index.php/Main_Page
[2]: http://senereko.ceres.rub.de/
On Thu 27 Feb 2014 18:47:24 CET, Martin Mueller wrote:
> I think you can do that and more complex things with MorphAdorner,
> although it takes some familiarity with command line operations to take
> full advantage of it. For instance, MorphAdorner will produce its
> linguistic annotation as XML or tabular output. The tabular output is, if
> you will, a concordance on steroids. In addition to the standard KWIC
> output, every data row adds information about lemma, POS tag, and XPath.
>
>
> Martin Mueller
>
> Professor emeritus of English and Classics
> Northwestern University
>
>
>
>
> On 2/27/14 11:41 AM, "Martin Wynne" wrote:
>
>> On 27/02/14 13:55, Martin Mueller wrote:
>>> You can find the prototype at
>>> https://devadorner.northwestern.edu/corpussearch/pubsearch/
>>>
>>
>> That looks good, and so does ECCO-TCP in Philologic. But I was proposing
>> all of ECCO-TCP only as an example. An example of an online service that
>> only allows searching of a pre-defined corpus is not what I was asking
>> for. What's the URL for a service to create a concordance of a
>> user-defined set of online TEI texts?
--
Frederik Elwert M.A.
Research Associate
Project Coordinator, SeNeReKo
Centrum für Religionswissenschaftliche Studien
Ruhr-Universität Bochum
Universitätsstr. 150
D-44780 Bochum
Room FNO 01/180
Tel. +49-(0)234 - 32 24794