Ron Van Den Branden wrote:
> I am considering to going even further:
> * treat every poem as <TEI> document
> * treat every bundle as a <teiCorpus> element, including the
> poems with entity references
In this sort of case, I would tend to choose between encoding as a
tei.Corpus with component documents on the one hand and a single document
with <group>ed <text>s on the other on the basis of metadata requirements.
The big difference between the corpus and the group approach is that the
latter allows only one teiHeader, whereas a corpus not only has its own
teiHeader on the container, but each constituent document has a teiHeader of
its own. How much that matters is very much dependent on what the metadata
needs of the project in question are. Plainly, the large corpora of
generically, historically, linguistically and/or structurally diverse texts
which tei.Corpus is primarily intended to encode will benefit from separate
teiHeaders. The less diverse and divergent the metadata requirements of a
collection are, the less likely it is to need multiple teiHeaders.
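To make the difference concrete, here is a rough sketch of the two layouts.
The element names follow your message (<teiCorpus>, <TEI>); the comments are
placeholders for real content, and namespaces and header details are left out.

```xml
<!-- Corpus approach: one header for the bundle, plus one per poem -->
<teiCorpus>
  <teiHeader><!-- metadata for the bundle as a whole --></teiHeader>
  <TEI>
    <teiHeader><!-- metadata for poem 1 only --></teiHeader>
    <text><body><!-- poem 1 --></body></text>
  </TEI>
  <TEI>
    <teiHeader><!-- metadata for poem 2 only --></teiHeader>
    <text><body><!-- poem 2 --></body></text>
  </TEI>
</teiCorpus>
```

```xml
<!-- Group approach: a single header covers every poem -->
<TEI>
  <teiHeader><!-- metadata for the bundle and all its poems --></teiHeader>
  <text>
    <group>
      <text><body><!-- poem 1 --></body></text>
      <text><body><!-- poem 2 --></body></text>
    </group>
  </text>
</TEI>
```

If the per-poem headers in the first layout would all say much the same
thing, that is a sign the second layout is probably enough.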
And as you yourself mention, it is possible to use entity inclusion (or
XIncludes) to assemble <group>ed <text>s. Inclusion via such techniques
isn't confined to corpus encoding.
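As a sketch of what I mean, assuming each poem lives in its own file with
<text> as its root element (the file names poem1.xml and poem2.xml are
invented for illustration), an XInclude-based assembly might look like this:

```xml
<TEI xmlns:xi="http://www.w3.org/2001/XInclude">
  <teiHeader><!-- single header for the whole bundle --></teiHeader>
  <text>
    <group>
      <!-- each href points to a file whose root element is <text> -->
      <xi:include href="poem1.xml"/>
      <xi:include href="poem2.xml"/>
    </group>
  </text>
</TEI>
```

The entity route is analogous: declare something like
<!ENTITY poem1 SYSTEM "poem1.xml"> in the DOCTYPE internal subset and write
&poem1; inside the <group>. Either way, I'm assuming the inclusions are
resolved before the assembled document is validated.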
On the more specific issues:
> This strategy is informed by the desire to assign every poem a high level
> autonomy, so that
> 1) poems can be transcribed and thus validated as unitary texts
> 2) poems that come in different versions in different bundles / manuscript
> collections can in a later phase more easily be collated against each
It seems to me that 1) is fully met by making each poem a <text> within a
group. I don't really see how treating each poem as a document within a
corpus would make any difference here. At the level of encoding management,
there is no reason why a transcriber-encoder shouldn't work on a document
instance where the <group> has only one <text> member if that seems
desirable. I don't see how 2) is made any easier by a multiple discrete
document approach rather than a <group>ed multiple <text> one. So I can't
really see the gains that would justify the strain placed upon the idea of a
corpus by the attempt to accommodate front and back matter at macro level.
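For what it's worth, the kind of working file I have in mind would look
something like the sketch below. The comments are placeholders, and the
<front> and <back> elements are only there to show where bundle-level
matter would sit once more poems are added to the <group>.

```xml
<TEI>
  <teiHeader><!-- header for the working document --></teiHeader>
  <text>
    <front><!-- bundle-level front matter, if any --></front>
    <group>
      <!-- a group with a single member while the poem is being transcribed -->
      <text><body><!-- the poem --></body></text>
    </group>
    <back><!-- bundle-level back matter, if any --></back>
  </text>
</TEI>
```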
Perhaps things would become clearer if you could say more about what you see
as the issues behind
> the encoding of the same texts at different levels (as autonomous texts
> AND parts of a bundle)
and in what respects <group>ing of <text>s is inadequate to meet what you
need to express.