Re: [TEI-L] Ideas about cross-archival search and retrieval

Hi, Joe.

I remember you from the TEI workshop with Dot and James.

Off the top of my head, I think an adaptation of the structure built by NINES might meet (some of) your needs.

Cheers to all,


Deborah K. Wright, PhD
Research Associate and Director, Matthew Prior Project
King Library
Miami University Libraries
Oxford, Ohio 45056

On 12/18/07 2:55 PM, "Joe Wicentowski" wrote:

Dear TEI-L,

This is my first post to the list.  I'm very new to TEI, but had a
wonderful introduction to TEI-P5 from Dot and James in Maryland in
October.  Greetings to everyone I met at the conference.

I'm writing to the list because I would like to probe TEI-L readers
for their thoughts on an evolving aspect of my digital archive project
(an archive of government documents on U.S. foreign policy). Since
this isn't directly a TEI question, I'd appreciate it if readers could
point me in the right direction:

I would like to find ways to open up archives/repositories of similar
"texts" and "documents" to cross-archival searches. Specifically, I
would like to be able to create ways for end-users to:

1. search across all of the sites providing documents that share
certain metadata elements (official documents, in my case) -- via a
cross-archival search API or data harvesting protocol

2. embed documents (i.e. their different manifestations - text, image,
etc.) from these archives in webpages -- via some URI scheme that
points to specific documents and retrieves page instances

With these sorts of goals in mind, I have been looking into projects
and initiatives that address this kind of problem, such as the Open
Archives Initiative (OAI).  Could TEI-L members with experience with
OAI or other initiatives share their recommendations and impressions
of these projects?  Are there any projects or models for
finer-than-Google cross-archival search and embedded linking?
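
For concreteness, harvesting with the OAI protocol (OAI-PMH) comes
down to sending an HTTP request with a "verb" parameter and reading
back XML records, commonly carrying Dublin Core metadata.  Below is
a minimal Python sketch of that idea, not a definitive
implementation.  The endpoint archive.example.org, the sample
record, and the helper names list_records_url and extract_records
are all hypothetical; only the verb, the parameter names, and the
XML namespaces come from the OAI-PMH 2.0 and Dublin Core
specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response, trimmed to one record for illustration.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:doc-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical memorandum</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one set of the repository.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_records(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", namespaces=NS
        )
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


if __name__ == "__main__":
    # Assumed endpoint; a real harvester would fetch this URL.
    print(list_records_url("https://archive.example.org/oai"))
    print(extract_records(SAMPLE))
```

A cross-archival index could, in principle, run the same request
against each participating repository and merge the results on the
shared metadata fields.  Real responses may also include a
resumptionToken for paging, which this sketch ignores.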

This might lead back to a more general discussion about TEI
publishing: Once we have our texts coded and want to share this
contribution with the world, how do we ensure that people "out there"
can find them easily and search common metadata well?  How do we make
them available outside the boundaries of our individual archives and
yet searchable in a more fine-grained way than Google allows?  Have
communities of like-genre encoders (such as poetry
encoders, letter encoders, oral history transcript encoders,
manuscript encoders) or like-topic archivists found each other and
found ways to make their work more easily discoverable?

Let me add some context for my question. My office currently publishes
book-length compilations of documents, and we post these texts online
in simple electronic (HTML/PDF) form. We are planning an upgraded
version of the website, with improved search capabilities and a
TEI-based document format. As part of this process we've discovered
that we are one of many 'publishers' of U.S. government documents
online. Others include the Digital National Security Archive, the
National Archives' AAD project, various agencies' FOIA (Freedom of
Information Act) websites, etc. In other words, we're aware that
we're only one of many sources of the kind of documents we publish.
Furthermore, it would be ideal if students or scholars could
make their own repositories. Our collective endeavor to share key
documents explaining government policy would be greatly strengthened
if there were a way to search across them all.

Thanks in advance for any pointers or thoughts you might have,