This started as a response to Karen Desmond's question but quickly
morphed (as was more or less inevitable) into a reaction to some of the
big questions circulating:
On 8/23/2011 9:44 AM, Karen wrote:
> I am completing a very small project using TEI. An edition of a short medieval
> technical text that appears in various states of completion and variation in
> several medieval manuscripts. And I am frustrated at the workarounds I have
> to do (such as how to represent lines of manuscript - or page breaks - or
> lacunae - that overlap with document elements - such as paragraphs or divs -
> or even readings in the various sources).
> I'm also concerned about the level of markup I am including with my
> transcriptions and prefer the method of making the best diplomatic
> transcriptions I can of each text and then running it through an automatic
> collation software like Juxta. Ideally this would be how I would best like to
> present this edition - and then include annotations that would overlay this -
> with my critical commentary.
> Will TEI be moving in this direction?
I can't possibly answer the last question, but I can reiterate (as I
often have on this list and elsewhere) that the problems Karen indicates
in applying TEI markup are profound ones, raising questions at the very
heart of TEI, in particular with its specification as an application
(tag set) of XML. They are interesting questions, important ones, and
the subject of ongoing research in several places.
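To make the overlap problem concrete: XML enforces a single strict hierarchy, so a page break that falls mid-paragraph cannot itself be a container. The usual TEI workaround is the empty "milestone" element, such as <pb/> or <lb/>, which records a boundary without enclosing anything. A minimal sketch (the element names are from the TEI Guidelines; the text content is invented for illustration):

```xml
<div>
  <p>Text of a paragraph that happens to run
     <pb n="42v"/>
     across a page boundary; the empty pb element marks the
     break as a point, not a container, so the paragraph and
     the page need not nest inside one another.</p>
</div>
```

The cost of the workaround is that the page, as a textual object, is never actually represented as a unit -- it must be reconstructed by processing, which is precisely the frustration Karen describes.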
Until practical solutions to these problems emerge, however, she
basically has a choice of whether to engage directly in this R&D herself
(at which point she turns from research into Medieval manuscripts to
research in computational data modeling), or whether to take a
wait-patiently approach (in the confidence that her frustration with the
status quo is shared). The choice "use what works" just isn't available
yet. When it is, I am confident that the TEI community will find ways of
taking advantage of it.
> I note that the presentation given at the summer school in Oxford on digital
> editing gives a reference to Multi-version documents.
> Having read up on this last night this would seem much more like I would like to
> do - but is such an approach likely to become mainstream?
Candidly, may I ask why it matters whether the approach will be mainstream?
The reason I ask is not just to play devil's advocate (because I have a
place in my heart for boutique projects that achieve interesting and
illuminating results without using mainstream approaches) -- or even to
elicit an answer, so much as to pose a problem. Whatever your answer is
(and I can imagine several I think are legitimate), it plays straight
into some of the questions regarding the TEI's strategic direction that
are currently vexing readers of this list.
Let me skip to the chase: if readers think the interchange of data
across systems all nominally conformant to a "standard" such as TEI is a
problem now, then just wait till we have data models that can
accommodate overlap gracefully.
Of course, this is both a problem and an opportunity, depending on how
you look at it. Basically I think it reveals that we want the TEI for two
very different reasons:

1. TEI gives us a wagon, and if it's a reasonably good one, we don't
have to reinvent wheels, but can just get on with the work of rolling.

2. TEI gives us wheels, and axles and axle grease, and we can see TEI
wagons at the annual TEI wagon show; but you don't get very far with the
TEI without getting your hands dirty, because it isn't that kind of
technology, and your terrain is rough. (Or cold, or swampy, or dry.)
TEI seeks, properly (I think), to accommodate both sets of needs.
However, the stress will remain. Say you have a fairly versatile but
easy-to-learn, lightweight standard for plain vanilla markup, supported
by tools that make it simple to do simple things. It gives you the first
80% of what you need. Then you learn that to get the remaining 20%
demands a level of engagement with the technology that you had not
planned on. Yet, you amaze yourself by picking up some CSS and some
XSLT, and make headway. You didn't plan on becoming a markup expert as
well as a Medievalist, but it's okay.
And then you discover that your texts, once you've worked into this last
20%, are no longer fully and blindly "interchangeable". Is this a
failure? For you, or for the TEI?
Unlike some contributors to this list, I don't think TEI has failed if
it can't guarantee blind interchange. I do imagine some things could be
done to make that first 80% easier, more solid and (yes) more
interchangeable; that this is a worthy development goal; and that it
could dovetail with other needs the TEI faces to provide value to its
users and constituencies. Plus I think there is a need for this
development that is felt considerably beyond the TEI.
However, I also feel there is considerable value in the support TEI
gives to users who need to push beyond that 80% (and who doesn't?).
Doing this means, to my mind, accepting that beyond this 80% (which will
be 95% for some, 60% for others) and into the really interesting harder
problems, we aren't going to get blind interchange for free.
Over the long term, is it possible for TEI to provide help to both sides
of this stress, and (I would argue) foster their complementarity? In
particular, does everyone's or even anyone's last 20% have to make it
into the Guidelines, thereby compromising the integrity of the 80% for
everyone? (Keep in mind that the 20%, when everyone's 20% is added
together, is a lot bigger than the 80%.)
If the answer to the last question is "yes, it's just too tempting to
commend innovative solutions with the final imprimatur of the TEI
Guidelines", then I fear what awaits TEI once data models to support
overlap become practical. (That is, while I have no idea what it will
be, I think I'm scared of it. :-)
My own feeling is that developers who design, implement and publicly
document good solutions in the 20% should be recognized, praised and
published -- but that extending the Guidelines should not be regarded as
an appropriate way to do so (for various reasons including the impact
that has on interchange).
Maybe the TEI could offer namespaces instead, providing recognition and
support to qualifying authors of new modules, without bloating the core
tag set further?
And if a lean-and-mean core tag set in the 80% could be published in a
namespace distinct from that of TEI-all, just so stylesheet developers
would know what was (not) in scope when handling it, that would be a
good thing too.
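For illustration only -- the second namespace URI and the mvd:witnessVariant element below are invented, not proposals -- the idea is that a project's extra markup could live in its own namespace alongside the core TEI vocabulary, so a stylesheet developer can tell at a glance what is, and is not, in scope:

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:mvd="http://example.org/ns/multi-version">
  <text>
    <body>
      <!-- unprefixed elements: core TEI, safe for generic tools -->
      <p>Ordinary TEI markup here;
        <mvd:witnessVariant wit="#A">project-specific markup,
        clearly flagged by its prefix</mvd:witnessVariant>.
      </p>
    </body>
  </text>
</TEI>
```

A generic TEI stylesheet could then process everything in the TEI namespace and skip (or pass through) the rest, rather than silently mishandling elements it has never heard of.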
Mind you, I make these suggestions not because I am a big fan of XML
namespaces. (And yes, namespace proliferation can be a headache too.) I
just wonder if they may fit the problem here ... to say nothing of the
problems of tomorrow....
Wendell Piez mailto:[log in to unmask]
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
Mulberry Technologies: A Consultancy Specializing in SGML and XML