I hope this doesn't seem like a betrayal or a deep heresy (my turn to
get the flames!) but I've heard good things about the possibility of
using the METS schema, which allows you to bring together metadata, a
text transcription (e.g. in TEI), and multiple images (whether page
images of the original text or associated illustrations) and specify their behaviour. I
don't know a whole lot about it but it looks pretty elegant; they're
starting to use it at the Brown digital library projects. Those who
actually know something about METS should speak up and correct or
amplify what I've said.
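
For anyone who hasn't seen METS, here is a minimal sketch of how one wrapper document might tie together descriptive metadata, a TEI transcription, and a page image. The element and attribute names are from the METS schema as I understand it; the file paths, IDs, title, and USE labels are invented for illustration, and the behaviour section is left out entirely.

```xml
<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- Descriptive metadata for the page; the title is a placeholder -->
  <dmdSec ID="DMD1">
    <mdWrap MDTYPE="DC">
      <xmlData>
        <dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">Hypothetical page 1</dc:title>
      </xmlData>
    </mdWrap>
  </dmdSec>
  <!-- Inventory of files: the TEI transcription and one page image -->
  <fileSec>
    <fileGrp USE="transcription">
      <file ID="TEI1" MIMETYPE="text/xml">
        <FLocat LOCTYPE="URL" xlink:href="text/transcription.xml"/>
      </file>
    </fileGrp>
    <fileGrp USE="page images">
      <file ID="IMG1" MIMETYPE="image/jpeg">
        <FLocat LOCTYPE="URL" xlink:href="images/page001.jpg"/>
      </file>
    </fileGrp>
  </fileSec>
  <!-- Structural map: one page div pointing at both files and its metadata -->
  <structMap TYPE="physical">
    <div TYPE="page" ORDER="1" DMDID="DMD1">
      <fptr FILEID="IMG1"/>
      <fptr FILEID="TEI1"/>
    </div>
  </structMap>
</mets>
```

The point of the sketch is only that the image metadata lives in the wrapper rather than in the TEI file, so the transcription itself stays uncluttered.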
Would one advantage of this method be that image metadata could be
stored much more richly and conveniently if one didn't have to cram
it into TEI, which doesn't really have provision for this sort of
thing? Presumably one could also encode direct links from the TEI
text to the images, but they wouldn't have to carry the whole
There, I've said enough to get myself in trouble, I guess--
>I'm new here so I hope I'm not re-opening an old wound. But I have
>questions about how to tag a project with text transcription, page
>images, and meta-data about those images using teixlite. Searching the
>archives of this list I see that some discussion of this topic took place
>several years ago. But I'm unclear whether the changes proposed then have
>In July of '97 a couple of people suggested modifying the pb element to
>include an entity attribute. However, in August of '97 Lou Burnard
>first held that one should always use figure elements for page images.
>However, doing so inevitably leads to tag abuse since <figure> isn't allowed
>directly within <div>s or between them. When pressed Mr. Burnard suggested
>modifications to x.globIncl. However, Michael Sperberg-McQueen suggested
>modifying x.common to include figure, table, and text elements. It doesn't
>look to me like any modifications were made to the teixlite DTD based on
>So I'm still left wondering how to include page images and meta-data about
>them without contorting the markup by adding bogus <p> elements. Is there
>any agreement on this yet?
>How about the following suggestion? Add an entity attribute to the pb
>element to encode the URL (indirectly through the entity) of the image and
>an optional pgDesc element to the content of pb, i.e.
><!ELEMENT pb (pgDesc?)>
><!ATTLIST pb [other attributes]
> entity ENTITY #IMPLIED>
>Meta-data for the page image could be recorded in the pgDesc element
>similarly to the way it is recorded for a figure in figDesc. When no
>meta-data was needed, the pb tag could still be written as <pb />, keeping
>previously encoded instances compatible with the revised DTD.
>Am I way off base here? I'm open to suggestions, flames, etc. Can other
>folks tell me how they're handling this 'problem'?
>Digital Library R & D
>University of Virginia
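
To make the quoted pb proposal concrete, here is a rough sketch of how it might look in an encoded document. This assumes the DTD had been modified as proposed, which as far as I know it has not. The notation name, entity name, file path, and description text are all made up for illustration, and the content model for pgDesc is a guess, since the proposal doesn't give one.

```xml
<!-- In the DTD subset: declare the image as an unparsed entity.
     The names "jpeg" and "page001" and the path are hypothetical. -->
<!NOTATION jpeg SYSTEM "image/jpeg">
<!ENTITY page001 SYSTEM "images/page001.jpg" NDATA jpeg>

<!-- Assumed declaration for pgDesc; not part of the original proposal -->
<!ELEMENT pgDesc (#PCDATA)>

<!-- In the text: a page break that points at its image and describes it -->
<pb n="1" entity="page001">
  <pgDesc>Hypothetical description of the scanned page.</pgDesc>
</pb>

<!-- A page break with no image metadata still works as before -->
<pb n="2"/>
```

Because pgDesc is optional in the proposed content model, an empty pb stays valid, which is what keeps existing encodings compatible.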