On Fri, 2 Mar 2001, Michael Fraser wrote:
> > Alas, yes. The Thesaurus Linguae Graecae (http://www.tlg.uci.edu) has a
> > huge, beautiful database of Greek texts in a proprietary markup scheme
> > called Beta-code (described at the above URL). A whole cottage industry
> > has grown up around this, making programs that can read and search the
> > TLG's CD, because every classicist wants to use these texts -- the TLG
> > has *all* of ancient Greek literature, including fragmentary texts and
> > obscure authors.
> Let's be fair. TLG started in 1972; they could hardly have considered even
> (S)GML back then given that it was unlikely that GML had trickled down
> from IBM to humanities projects in that short space of time. Anyway,
> hasn't the Perseus Project converted many of the TLG texts to TEI whilst
> retaining the TLG's beta code as a transliteration scheme (indeed, beta
> code is (or was) the de facto standard for transliterating ancient Greek
> in electronic documents)? At least the beta code scheme is *documented*
> and pretty much human-readable (which has enabled the 'cottage industry'
> of applications to develop). Perhaps interesting to speculate whether the
> same growth of applications would have occurred if the TLG had converted
> their texts to TEI (i.e. is the development driven simply by the demand of
> classicists or does the encoding scheme used by the TLG make it relatively
> simple to develop applications to search and browse the collection?).
The TLG had an opportunity to do just that, in 1994. I had just started
working for Susan Hockey at CETH (the Center for Electronic Texts in the
Humanities), now defunct. My first big project was to write a filter to
convert the markup portion of Beta code, which uses a COCOA-like set of
"event markers" to indicate points of change in a document's logical
structure, to a more tree-like model, and from there to SGML that conformed
to the TEI DTD.
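
As a rough illustration of that kind of filter (this is not the CETH code
and not real Beta code markup: the `<level value>` marker syntax, the
three-level hierarchy, and the `div` output are all invented for the
example), a minimal Python sketch might look like this. Each marker signals
a change of state at one level, so the filter closes that level and anything
nested inside it before opening the new element.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical hierarchy, outermost first; not the TLG's actual citation levels.
LEVELS = ["work", "book", "line"]

# Hypothetical COCOA-like marker: "<level value>" alone on a line.
MARKER = re.compile(r"^<(\w+) ([^>]*)>$")


def markers_to_tree(lines, levels=LEVELS):
    """Turn a flat stream of state-change markers and text into nested markup."""
    out = []
    open_levels = []  # stack of currently open level names, outermost first

    def close_from(rank):
        # Close every open element at this rank or deeper, innermost first.
        while open_levels and levels.index(open_levels[-1]) >= rank:
            open_levels.pop()
            out.append("  " * len(open_levels) + "</div>")

    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        m = MARKER.match(line)
        if m and m.group(1) in levels:
            name, value = m.groups()
            # A change at this level implicitly ends it and whatever it contains.
            close_from(levels.index(name))
            indent = "  " * len(open_levels)
            out.append(f"{indent}<div type={quoteattr(name)} n={quoteattr(value)}>")
            open_levels.append(name)
        else:
            # Anything that is not a known marker is treated as text content.
            out.append("  " * len(open_levels) + escape(line))

    close_from(0)  # end of input closes everything still open
    return "\n".join(out)


sample = [
    "<work Iliad>",
    "<book 1>",
    "<line 1>",
    "first line of text",
    "<line 2>",
    "second line of text",
    "<book 2>",
    "<line 1>",
    "another line of text",
]
print(markers_to_tree(sample))
```

The output is only TEI-flavoured nesting; conforming to the TEI DTD would
also need a header, proper element choices, and validation, none of which
this sketch attempts.
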
The markup portion of beta code is very poorly documented. Moreover,
knowing beta code is not enough. The text ships on a CD in a binary format
that, despite my repeated requests, remained undocumented, at least for me.
Throughout the project, staff at the TLG were either uninterested in or
hostile to our work. Though the filters were a success, they were never put
to any use other than as a prototype demo within CETH.
- Gregory Murphy
__o Software Engineer
_`\<,_ Solaris Software
(*)/ (*) Sun Microsystems