I have recently (re)joined this listserv in an effort to learn more about
TEI. In one short week I have been able to pick up some very useful
information: I have downloaded the trial version of oXygen 9 (our team
already has a licence for an earlier version) and, thanks to Martin's
message, have become aware of the TEIwiki and of the project to put
together a TEI tutorial with example texts.
Though I have read through the documentation on TEI P4 and have looked through
some of the more recent TEI P5 official recommendations, I am really at the
very beginning of learning TEI (and XML, for that matter). Our team
is preparing a (relatively) large corpus of linguistic texts that we have
scanned and run through OCR. Our workflow starts in Word, rather than in
plain text. Using paragraph and character styles (as well as some macros
for search and replace), we can add a certain amount of markup (<hi>,
<div1>, <div2>, <pb>, <p>, <name>, <cit>, <quote>, <bibl>, etc.) without the
tags cluttering up the document, and then export the result using a
conversion macro. As soon as we need nested tags, though, the limitations
of Word begin to show.
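To illustrate what I mean by nesting, here is a rough sketch of the kind of output involved. The sentences, the name, and the rend value are invented placeholders rather than text from our corpus, and I am assuming the usual P5 content models. The first paragraph needs only one paragraph style and one character style; the second needs <name> and <emph> inside <quote> inside <p>, which a single character style in Word cannot express.

```xml
<!-- flat: one paragraph style plus one character style is enough -->
<p>The author discusses <hi rend="italic">aspect</hi> at length.</p>

<!-- nested: name and emph sit inside quote, which sits inside p -->
<p>The author then gives
  <quote><name>Paul</name> <emph>had</emph> left before noon.</quote>
  as an illustration.</p>
```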
However, it seems important to me to have an idea of where we are going in
the long run, even at this stage. One of the things that we would very much
like to mark is "examples", and in particular examples for which there is no
reference. Linguists often cite literary sources, in which case it seems to
me that we would do well to use <cit>. But when the example is created
or overheard, I'm not sure what to do. Of the three tags I've found,
<eg>, <exemplum>, and <mentioned>, only the latter two seem to allow "type"
attributes, and, unless I've misunderstood the definition of <exemplum>, it
appears that only <mentioned> can accept child elements such as <emph> or
Does anyone have any suggestions regarding the coding of examples, such that
they can be distinguished from items which are simply "mentioned"?
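For concreteness, this is roughly the contrast I am after, as far as I understand the elements so far. The sentence and the reference below are made-up placeholders, not items from our corpus. The open question is what should take the place of <cit> when the example has no source to put in <bibl>.

```xml
<!-- example taken from a literary source: the quotation plus its reference -->
<cit>
  <quote>His house is <emph>bigger</emph> than mine.</quote>
  <bibl>(placeholder source)</bibl>
</cit>

<!-- a single word under discussion, simply "mentioned" -->
<p>The form <mentioned>bigger</mentioned> is a comparative.</p>
```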
<mentioned> works well enough when it is a single word that is being
discussed, yet I wonder if I wouldn't be abusing the tag if I were to
introduce a whole host of types:
Semantically, <eg> seems more appropriate for sentences and phrases, insofar
as they are not simply "mentioned" in the text in the same way as a single
word or verb form might be.
<mentioned type="set">that, those</mentioned> similarly ; cf.
<mentioned type="sentence">his house is bigger than <emph>that</emph> of his
Any pointers to further information on this matter would
be much appreciated. (I would also be interested in learning about forums
associated with this listserv, or indexed versions of it.) I apologize if this
topic has already been treated.
Thank you for the rich information already being shared on this list,
even if I have to admit that a good deal of it is going well
over my head.
PhD candidate, Ecole normale sup -- lettres et sciences humaines, Lyon