I have a version/revision control question for the list. I'd appreciate
hearing any techniques others might have developed. What I am looking
for is a way to reduce duplication and ensure proper revision control
for forms found in more than one TEI encoded file.
The Electronic Caedmon's Hymn is based on three types of TEI-encoded
documents:
1) manuscript transcriptions;
2) variant reading lists (apparatus);
3) critical texts.
All readings from the poem, in all document types, originate in
docType1 (the manuscript transcriptions). These contain the content of
the poem in a series of extremely richly encoded <w> (i.e. word) tags.
Words are then transmitted to the other document types for use in
different contexts according to the following scheme (the id, copyOf and
sameAs attributes show how ID references are handled in the
translations; they are explained below):
document type 1 ---verbatim copy----> document type 2
id=x                                  id=y copyOf=x

document type 1 ---simplified copy--> document type 3
id=x                                  id=z sameAs=x
As this diagram shows, copies of individual elements always carry a
reference to the ID of the original instance with them when they are
copied into a new document (there are no duplicated IDs in the entire
project). The ID reappears as the value of a copyOf attribute if the
element is a verbatim copy, or of a sameAs attribute if it has been
simplified but not substantively changed (this seems like a stretch of
the TEI guidelines; I may define new attribute names for the element
later). Because of this I am always able both to trace the origin of
readings in docTypes 2 and 3 and to know something about how closely
each is related to the original form.
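
To make the tracing concrete, here is a minimal sketch in Python of how a
copyOf or sameAs value can be followed back to its source. The three
miniature documents, the ids x1/x2/y1/z2, the element names other than
<w>, and the function names are all invented for illustration; the real
files are far more richly encoded and this is not the project's own
tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the three document types (placeholder content).
TYPE1 = '<text><w id="x1">CONTENT1</w><w id="x2">CON<hi>TENT</hi>2</w></text>'
TYPE2 = '<app><w id="y1" copyOf="x1">CONTENT1</w></app>'
TYPE3 = '<text><w id="z2" sameAs="x2">CONTENT2</w></text>'


def index_by_id(xml_text):
    """Map every id found in a document to the element that carries it."""
    root = ET.fromstring(xml_text)
    return {el.get("id"): el for el in root.iter() if el.get("id")}


def trace(xml_text, sources):
    """List (copy id, relationship, source id, source text) for each copy."""
    found = []
    for el in ET.fromstring(xml_text).iter("w"):
        for rel in ("copyOf", "sameAs"):
            target = el.get(rel)
            if target is None:
                continue
            src = sources.get(target)  # None if the reference dangles
            text = None if src is None else "".join(src.itertext())
            found.append((el.get("id"), rel, target, text))
    return found


sources = index_by_id(TYPE1)
print(trace(TYPE2, sources))  # verbatim copies in the apparatus
print(trace(TYPE3, sources))  # simplified copies in the critical text
```

The relationship name in each tuple records how close the copy is meant
to be to the original: copyOf for a verbatim copy, sameAs for a
simplified one.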
I am now doing a final proof-reading of the entire project, and I want
to ensure that changes to individual elements in docType1 are kept
up-to-date in other documents where verbatim copies are found (sameAs
copies are slightly less important since they need to be edited by hand
anyway). I could of course cut and paste everything every time I make a
change in docType1, but this seems to me likely to introduce as many
errors as it removes (the coding is complicated in docType2 as well).
As a result, I am currently considering what I think is a radical
solution:
1) proof-reading the entire set of Type1 documents before making changes
2) copying the entire contents of Type1 documents to my project entity
file and redefining each element as an entity (i.e. <w id="x">CONTENT</w>
to <!ENTITY copyX '<w copyOf="x">CONTENT</w>'>)
3) reducing all copied elements in doctype 2 to an id-based entity
reference (i.e. <w id="y" copyOf="x">CONTENT</w> to &copyX;)
4) using the entities to keep the readings in doctype 2 current
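
Steps 2 to 4 can be sketched roughly as follows in Python. This is an
illustration under stated assumptions, not the automation I actually
ran: the sample documents, the function names, and the naming rule
(id "x" becomes entity copyX) are hypothetical, <w> elements are assumed
not to nest or self-close, and attribute values are assumed to be
double-quoted.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature documents using the placeholders from the steps above.
TYPE1 = '<text><w id="x">CONTENT</w></text>'
TYPE2 = '<app><rdg><w id="y" copyOf="x">CONTENT</w></rdg></app>'


def entity_name(source_id):
    # Assumed naming rule modelled on the example: id "x" -> "copyX".
    return "copy" + source_id[:1].upper() + source_id[1:]


def entity_declarations(type1_xml):
    """Step 2: wrap each <w id="..."> in an entity holding a copyOf version."""
    decls = []
    for w in ET.fromstring(type1_xml).iter("w"):
        source_id = w.attrib.pop("id", None)
        if source_id is None:
            continue
        w.set("copyOf", source_id)
        w.tail = None  # keep text that follows the element out of the entity
        body = ET.tostring(w, encoding="unicode")
        if "'" in body or "%" in body:
            raise ValueError("unsafe character in entity value for " + source_id)
        decls.append("<!ENTITY %s '%s'>" % (entity_name(source_id), body))
    return decls


# Matches a whole <w ... copyOf="..."> ... </w> element (no nesting assumed).
COPY_W = re.compile(r'<w\b[^>]*\bcopyOf="([^"]+)"[^>]*>.*?</w>', re.DOTALL)


def use_entities(type2_xml):
    """Step 3: replace each verbatim copy with its id-based entity reference."""
    return COPY_W.sub(lambda m: "&%s;" % entity_name(m.group(1)), type2_xml)


def expand(decls, body_xml, root_name="app"):
    """Step 4: parse the reduced document with the entities declared inline."""
    doc = "<!DOCTYPE %s [\n%s\n]>\n%s" % (root_name, "\n".join(decls), body_xml)
    return ET.fromstring(doc)


decls = entity_declarations(TYPE1)
reduced = use_entities(TYPE2)
print(decls[0])
print(reduced)
print(ET.tostring(expand(decls, reduced), encoding="unicode"))
```

As in the example in step 2, the expanded element carries only the
copyOf value; the id the copy originally had in doctype 2 is not
preserved by the entity.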
I've done a trial run, and the automation seems to work o.k. But this
seems totally against the grain of the TEI--I'm sure it can't be the
right or best way of doing this. Does anybody have a solution that
keeps the relationship explicit within the TEI? For time reasons, I'd
prefer solutions that work as is with current SGML or XML browsers.