On Sat, 17 Mar 2007, Sylvain Loiseau wrote:
> In the context of a project involving Wikipedia, we are converting the
> whole (French) Wikipedia into TEI format. This is done by modifying the
> MediaWiki software so that it produces TEI rather than HTML as output. This
> modification makes it possible to produce high-quality, finely tuned output.
> The division structure of the text is created with div elements. Numerous
> variables in the wiki text (such as the categorization of the article) may be
> recorded in the header. Structures (list, table, link, gallery) are converted.
> This conversion is not yet complete, but should be finished in a few
> weeks. The template mechanism is still to be hacked. The main author of
> this conversion is Bernard Desgraupes. There is also an ODD document defining
> precisely the vocabulary used in this tei4mediawiki.
> But the MediaWiki software sometimes produces XHTML that is not well-formed.
> We are not yet sure what proportion of the TEI documents produced are not
> well-formed.
Since posting my original query (to which Sylvain is responding), I have
spent some time looking at DekiWiki (http://opengarden.org/dekiwiki),
which is an open-source product built on MediaWiki. It differs from
MediaWiki in that it uses a WYSIWYG editor and the underlying data format
is XHTML from the start. (A user can toggle between WYSIWYG editing and
direct entry of HTML tags.)
This has one big advantage over standard MediaWiki syntax: you can create
a template file containing constructs like <div class="div1">, or add
pseudo-TEI tags like <span class="corr">, which can very easily be
converted to TEI via an XSLT transformation on the exported HTML.
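
To make the idea concrete, here is a minimal sketch of that class-to-element
mapping. It uses Python's standard-library ElementTree in place of an actual
XSLT stylesheet, purely so the example is self-contained. The mapping table,
the function names, and the sample markup are my own illustrative assumptions,
not anything DekiWiki ships, and elements without a mapped class are simply
passed through under their own names.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from (HTML element, class attribute) to a TEI element
# name. Only the two constructs mentioned above are covered.
CLASS_TO_TEI = {
    ("div", "div1"): "div1",
    ("span", "corr"): "corr",
}


def local_name(tag):
    """Strip a namespace prefix such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]


def to_tei(elem):
    """Return a copy of elem with class-tagged elements renamed to TEI names.

    Assumptions: unmapped elements keep their local name, all attributes are
    dropped, and the output carries no namespace (TEI P5 would need one).
    """
    name = local_name(elem.tag)
    tei_name = CLASS_TO_TEI.get((name, elem.get("class")), name)
    out = ET.Element(tei_name)
    out.text = elem.text  # text before the first child
    out.tail = elem.tail  # text following this element inside its parent
    for child in elem:
        out.append(to_tei(child))
    return out


sample = '<div class="div1"><p>The <span class="corr">quick</span> fox.</p></div>'
print(ET.tostring(to_tei(ET.fromstring(sample)), encoding="unicode"))
# prints: <div1><p>The <corr>quick</corr> fox.</p></div1>
```

A real stylesheet would do the same thing with one template per class value;
the point is only that the class attribute carries enough information for the
conversion to be mechanical.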
David Sewell, Editorial and Technical Manager
Electronic Imprint, The University of Virginia Press
PO Box 400318, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Tel: +1 434 924 9973