Some thoughts on the future of the TEI, proceeding, I hope, from least
to most controversial.
1. Marketing the TEI
1a. Is the TEI about "interchange"?
The term "interchange" gets bandied about as a benefit of using TEI
markup. While it is true that TEI texts are "interchangeable" in the
sense that anyone with an XML processor can parse them, is that really
"interchange" in the sense understood by many users?
Let me use an example from some of my current work: ODF. Users are
unpleasantly surprised that texts can indeed be loaded in a variety of
applications (no names please!) yet the displayed results are different.
Now, ODF folks will say, but the texts are "interoperable," even if they
That is *not* how users think about "interoperability" and their
expectations are as valid as anyone's.
I suspect that the expectation for "interchange" is quite similar:
that some other user encoded their text the same way I would. Not
very likely. Use of "foreign" texts is always going to take work, more
in some cases than others. But in all cases, the sense of disappointment
is going to be present.
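To make that concrete, here is a minimal sketch in Python (standard
library only). The two encodings and the helper names are invented for
illustration and are not taken from any real project. Both fragments
parse without complaint and carry the same words, yet a query written
against one encoder's habits finds nothing in the other.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Two hypothetical encodings of the same sentence (assumed examples).
# Encoder A tags the person with <persName>; encoder B uses the more
# general <name type="person">.
ENCODING_A = (
    f'<p xmlns="{TEI_NS}">Letter from '
    "<persName>Mark Twain</persName> to his editor.</p>"
)
ENCODING_B = (
    f'<p xmlns="{TEI_NS}">Letter from '
    '<name type="person">Mark Twain</name> to his editor.</p>'
)


def plain_text(xml_source):
    """Return the text content with all markup stripped."""
    return "".join(ET.fromstring(xml_source).itertext())


def person_names(xml_source):
    """Return the text of every <persName> element (encoder A's habit)."""
    root = ET.fromstring(xml_source)
    return [el.text for el in root.iter(f"{{{TEI_NS}}}persName")]


if __name__ == "__main__":
    # Both fragments parse and contain the same words...
    print(plain_text(ENCODING_A) == plain_text(ENCODING_B))
    # ...but the same query gives different answers.
    print(person_names(ENCODING_A))
    print(person_names(ENCODING_B))
```

Anyone receiving encoder B's text has to learn B's conventions and
adjust the query before getting anything useful out of it.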
XML markup, even if you and I use the same markup, carries implied
semantics that are not represented in the markup. So even in the
unlikely event that we use the same markup, you are likely to be
TEI markup is very useful as a default markup, but only as a default.
And one that carries (as do all other markup languages) implicit semantics.
Rather than setting users up for the disappointment of expecting to
easily benefit from the texts of others or to create texts that are
going to be snapped up by other scholars, let's be realistic about the
difficulties of interchange, even under the best of circumstances, and
not make it a selling point for the TEI.
1b. Is the TEI about text analysis?
How many books can you name about text analysis for the development or
application of markup? (Leaving aside Eve Maler's Developing SGML DTDs:
From Text to Model to Markup)
Imagine the TEI Guidelines organized around text genres, perhaps
keeping a "common" text features section/volume, and using documents
as encountered by text scholars as examples for the application of
markup.
Moving from the familiar (the texts) to the unfamiliar (markup).
It would be good to recall that the TEI Consortium was originally
populated by people who wanted to do things with texts; markup was just
a means to that end.
True, there would always be the schema bits, content models, and
transformation stylesheets, and those are of interest to me and others,
but marketing the TEI needs to reach out to a much broader audience.
Imagine picking some of the older material being imaged by Google and
preparing a TEI encoding of only the first chapter as an introduction,
showing what can be done if the entire text were encoded. I would choose
one that would be controversial today. Or several. Twain or others for
example. I don't think a lot of people would stay with it, but what
matters is that they register, $$$, or buy the hard copy book.
There are things that can be done with statistical analysis of large
amounts of text but those aren't the same things that can be
accomplished by markup. We just have to make the best case for what
Other posts to follow.
Hope everyone is having a great day!
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Another Word For It (blog): http://tm.durusau.net