[RVDB]
>>> this seems to imply that the transcription has to occur
>>> at collection level. Since only full <TEI> texts can be validated,
>>> there's no way to validate single poems (ie. <text>s)?
[MB]
>> At the level of encoding management,
>> there is no reason why a transcriber-encoder shouldn't work on a document
>> instance where the <group> has only one <text> member if that seems
>> desirable.
[RVDB]
>I'm afraid your hint isn't clear to me...
At the simplest level, I really meant only that a document instance like the
following is valid and can be edited with the full constraints of the DTD
applied by an editing application. Hence, if it is operationally convenient,
it's all an encoder needs to work on (and/or a validator module needs to
check). Merging of such documents (or rather of their innermost <text>
elements) into larger documents (with real teiHeader content etc) can be
handled elsewhere in the workflow. And extraction of such stub documents
from a canonical repository with multiple <text>s is also an easily
automated task. A very simple XSLT pass will do it: and if you had, say,
oXygen integrated with eXist (I know you are interested in the latter from
Another Place...) you could probably arrange for extraction and
re-integration of such stubs to be transparent to front-line encoders.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TEI.2 SYSTEM "http://www.tei-c.org/P4X/DTD/tei2.dtd" [
<!ENTITY % TEI.XML "INCLUDE" >
<!ENTITY % TEI.verse "INCLUDE" >
]>
<TEI.2>
<teiHeader>
<fileDesc><titleStmt><title></title></titleStmt>
<editionStmt><p></p></editionStmt>
<publicationStmt><p></p></publicationStmt>
<sourceDesc><p></p></sourceDesc>
</fileDesc>
</teiHeader>
<text>
<group>
<text>
<body>
<lg>
<l>A short poem</l>
</lg>
</body>
</text>
</group>
</text>
</TEI.2>
Of course, you could add any text-specific front and back material into that
inmost <text> if desired. I don't see what would be gained by adopting a
fairly radically new markup scheme (single poem in a collection = TEI
document) or in what sense(s) it would allow validation that couldn't
equally well be done on the above model.
But maybe I haven't understood what is behind your remark about what can be
validated.
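
To make the extraction and re-merging step mentioned earlier a little more
concrete, here is a rough sketch of the idea. It is written in Python with the
standard library rather than as the XSLT pass I had in mind, purely for
illustration. The names extract_stubs, merge_stubs and STUB_HEADER are
invented for this example, the placeholder header simply mirrors the empty one
in the stub above, and the DOCTYPE declaration is not written out (a real
workflow would have to add it back before validating).

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical placeholder header, mirroring the empty header of the stub.
STUB_HEADER = (
    "<teiHeader><fileDesc>"
    "<titleStmt><title></title></titleStmt>"
    "<editionStmt><p></p></editionStmt>"
    "<publicationStmt><p></p></publicationStmt>"
    "<sourceDesc><p></p></sourceDesc>"
    "</fileDesc></teiHeader>"
)


def extract_stubs(collection_xml):
    """Return one stub <TEI.2> element per innermost <text> of the collection.

    Assumes the TEI.2/text/group/text layout shown in the example document.
    """
    root = ET.fromstring(collection_xml)
    stubs = []
    for inner in root.findall("./text/group/text"):
        stub = ET.Element("TEI.2")
        stub.append(ET.fromstring(STUB_HEADER))
        group = ET.SubElement(ET.SubElement(stub, "text"), "group")
        # Copy the poem so the source tree is left untouched.
        group.append(copy.deepcopy(inner))
        stubs.append(stub)
    return stubs


def merge_stubs(stubs, header_xml=STUB_HEADER):
    """Gather the innermost <text>s of several stubs under a single <group>."""
    merged = ET.Element("TEI.2")
    merged.append(ET.fromstring(header_xml))
    group = ET.SubElement(ET.SubElement(merged, "text"), "group")
    for stub in stubs:
        for inner in stub.findall("./text/group/text"):
            group.append(copy.deepcopy(inner))
    return merged
```

Each stub can be serialized with ET.tostring(stub, encoding="unicode") and
handed to the encoder; passing real header content as header_xml to
merge_stubs covers the "real teiHeader content" part of the merge.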
> Well, some of the poems are unique; others occur in different versions. We
> intend to create one parallel-segmented collated version for each of the
> latter poems, thus 'unifying' different <text>s. A collection would then
> consist of the unique poems + the unified versions. In this respect, it
> seemed easier to think of the constituting parts as <TEI> documents. I must
> confess, however, this is still at the conception phase, so I might be on
> wrong tracks.
These are familiar enough goals for an electronic critical edition -- which
doesn't mean that they, or their encoding, are trivial to achieve, of
course; but I don't know of anyone who has found that the associated tasks
were made easier by a "corpus" approach per se. My hunch is that going down
that route might bring a bunch of additional problems while taking you into
territory where you would be less able to draw on the prior experience of
others.
Michael Beddow