Hi Shirley,
> First, I am a TEI novice, having recently bought the two-volume
> Guidelines for Electronic Text Encoding and Interchange, March 2002.
Welcome to the club :-) I wish UVpress would finally deliver mine.
> I've been told that TEI might be suitable for a project that I will be
> working on, bringing to the web Charles Olson's transcriptions of
> bibliographic information on the volume, content and location of Herman
> Melville's annotations and reading marks, which Olson transcribed on 5x7
> note cards. Someone will be transcribing Olson's note cards and will add
> his comments about each card. Two sample transcriptions are listed
> below, and I am including a URL to the scanned cards
> (http://www.lib.uconn.edu/~rwitthus/OlsonMelville/scans.htm).
Interesting project. How many of these cards are there? How much time
can you afford on the project?
> I would appreciate it if someone could let me know if TEI can be used
> for these transcriptions. Since each card's information varies so much,
> I'm unclear how I would apply the tag sets for Transcription of Primary
> Sources, which I'm guessing is what I need to use. At this point of the
> project, I'm pretty much groping in the dark for direction, and would
> appreciate any help you can give me. I do know that I want the
> information that's presented on the web to be cross-browser compatible.
> I will be using server side scripts to transform the XML into HTML.
I think (and this is just my personal opinion) you should know what the
TEI tag sets have to offer, but still keep an independent mind. First,
you don't have to use everything that the TEI offers and suggests, and
second, you can add your own tags and structure if you find the TEI
does not offer what you need.
It is good to use the TEI as a starting point, though, because it is a
widely known and accepted standard, and because a lot of thinking has
been done to create it, and you can benefit from that for free while
concentrating on the specifics of your project.
The first questions I would ask myself are:
1) What do I want to encode explicitly? Which of these things are high
priority, and which are less important?
2) What should the web presentation look like? Are there other
presentations intended? What features do I need encoded as a basis for
the web presentation?
3) How much time can I spend on this? Can the coding be done in
successive passes, important things first? Is it helpful to encode just
a few cards fully, and try the whole process with them? (DANGER: you
might never get to do the job for all of them, and would be left with
another "promising prototype" edition.)
*
I don't know much about the goals of your project, and I haven't thought
deeply about this, so I give the following answers to these questions
just as an example:
to 1:
- Detailed transcription of physical appearance doesn't matter, since
the facsimiles are there. The main thing is that the text is there in
a readable, searchable format. It may be a dangerous dead end to
transcribe by mimicking the layout with an "ASCII layout" (with the HTML
<pre> tag in mind). That approach breaks down as soon as you add tags,
and the point of the TEI is _explicit_ encoding. Rather, just encode
using <p> in the beginning, and maybe refine later with suitable tags,
for example for the arranged lines
H. Melville
Jan. 16, 1869
N.Y.
- The overall structure with necessary (bibliographic) meta-information
will be encoded with TEI-tags (TEI Header with bibliographic
information, <div type="card" id="card_taylor01">, ...).
- Annotations, appendices, etc. can be TEI-encoded and linked to the
text with XML/TEI means (IDs / IDREFs).
Nice to have:
- references to names and books tagged so that they can be recognized /
searched / etc. Names normalized, or even resolved to an external
list of persons / books. Initially, it is helpful just to clearly tag
such things, even if you don't use the tagging yet in very
sophisticated ways. Marking names with <name>...</name> is a good
starting point for more refined work on the second or third round of
work.
- more refined transcription; marking of unclear / erased passages,
special layout, ... One can easily get carried away, though, by what
_could_ be encoded. Keep the time budget and overall purpose in mind.
- it can be helpful to at least mark critical spots, so they don't get
lost; also, a note could be generated in the output (like, "see
facsimile for presentation format"). Such a marking could be done
like this:
<span type="troublespot" id="trouble_001">H. Melville | Jan 16, 1869
| N.Y.</span>
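A minimal sketch, in Python, of how such markers and <name> tags could be picked up by a script later on. Everything here is an assumption for illustration: the sample card, the function names facsimile_notes and tagged_names, and the wording of the generated note. Only the element and attribute names are taken from the examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample card; the markup follows the examples in this message.
SAMPLE = """<div type="card" id="card_taylor01">
  <p>Inscribed: <span type="troublespot" id="trouble_001">H. Melville | Jan 16, 1869
  | N.Y.</span></p>
  <p>Signed by <name>H. Melville</name>.</p>
</div>"""


def clean_text(element):
    """Collect all text inside an element and collapse runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


def facsimile_notes(card):
    """Return one reader note for every marked trouble spot in a card."""
    notes = []
    for span in card.iter("span"):
        if span.get("type") == "troublespot":
            notes.append(
                f'{span.get("id")}: "{clean_text(span)}" '
                "(see facsimile for presentation format)"
            )
    return notes


def tagged_names(card):
    """Return the distinct strings marked with <name>, sorted alphabetically."""
    return sorted({clean_text(name) for name in card.iter("name")})


card = ET.fromstring(SAMPLE)
for note in facsimile_notes(card):
    print(note)
print(tagged_names(card))
```

The point is only that explicit tags, even crude ones, can be found mechanically in a later pass, which an "ASCII layout" would not allow.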
to 2:
- let's say, I want to be able to leaf through cards either by
transcription or by image,
- open up the other form in a separate browser window with one click
(hope there's no patent on this :-),
- see an overview of all cards in the form of a list (plus maybe
thumbnail pictures?)
- text-search through the complete transcriptions,
- read the bibliographic meta-information,
- read the added structure (list of people / books) and use it to access
the material.
Consequences for the encoding? Almost none; on second thought it might
be useful to have a structure like this, to bundle images and editorial
material with the text of the cards:
<div type="card_collection">
  <div type="card">
    <div type="image">
      ...images in several resolutions...
    </div>
    <div type="transcription">
      ...transcribed text, initially simply in <p>s...
    </div>
    <div type="annotations">
      ...if needed, editorial comments per card...
    </div>
  </div>
  ...more cards...
</div>
<div type="appendix">
  ..list of people/books, etc....
</div>
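To illustrate how a structure like this could feed the overview list of cards, here is a rough Python sketch. It assumes a wrapping <body> element, invented card contents, a second hypothetical card id (card_taylor02), and a made-up function name card_overview; none of these are fixed by the TEI or by the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data using the div structure sketched above.
SAMPLE = """<body>
  <div type="card_collection">
    <div type="card" id="card_taylor01">
      <div type="image"/>
      <div type="transcription"><p>H. Melville Jan. 16, 1869 N.Y.</p></div>
      <div type="annotations"><p>Editorial comment on this card.</p></div>
    </div>
    <div type="card" id="card_taylor02">
      <div type="image"/>
      <div type="transcription"><p>Second card, first paragraph.</p><p>More text.</p></div>
    </div>
  </div>
  <div type="appendix"/>
</body>"""


def card_overview(root):
    """Return (id, first paragraph of transcription, has annotations) per card."""
    overview = []
    for card in root.findall("./div[@type='card_collection']/div[@type='card']"):
        first_p = card.find("./div[@type='transcription']/p")
        teaser = ""
        if first_p is not None:
            # Collapse whitespace so the teaser fits on one line of the list.
            teaser = " ".join("".join(first_p.itertext()).split())
        has_notes = card.find("./div[@type='annotations']") is not None
        overview.append((card.get("id"), teaser, has_notes))
    return overview


root = ET.fromstring(SAMPLE)
for card_id, teaser, has_notes in card_overview(root):
    marker = " [annotated]" if has_notes else ""
    print(f"{card_id}: {teaser}{marker}")
```

A server-side script producing the HTML list could work along the same lines, whatever language or XSLT processor is actually used.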
to 3: After the effort for designing the overall structure is done (2-4
weeks) and two cards are encoded (1 week), it might be time to
build the whole system (3-6 weeks) and see that it works as
expected (0-4 weeks), and that nothing important was forgotten
(0-4 weeks). The encoding of the remaining cards should then not
cost too much time per card (1-12 hours) for the simple initial
encoding. ((estimates very spontaneous, not to be taken seriously))
A second pass over the material will probably lead to a revised
structure (1-3 weeks), improvements to the presentation system (2-8
weeks) and additional encoding (1 hour per card).
Making everything _really_ work without obvious bugs certainly
needs time as well. (2-8 weeks).
This improvement cycle can last as long as the funding lasts :-)
Searching is an issue of its own and can be delegated or
postponed. Maybe an embedded Google search on the finished web
site is enough, though it does not look very scholarly.
*
Uff, half an hour later... I realize that I said little about the TEI,
and probably did not answer your question. Maybe it's useful anyway.
By the way, my short answer to the question in your subject line is: Yes.
Good luck, enjoy the coding,
Tobias
--
.............................................
(_) Tobias Rischer
"==='
" "
...still.loving.GNU..........................