Oh, absolutely the TEI documents would be useful for the things you mention (and probably a hundred other things that I can't even think of at the moment).
But for Project Gutenberg's purposes, we want to get the texts out into the hands of the general public. The more esoteric academic purposes are beyond our specific goals (not counter to them, just beyond them).
The most accessible formats for our purpose are TXT, PDF and HTML. TEI provides an easy way to maintain a single "master" document instead of having to maintain multiple files when typos/fixes are reported later down the road. Our biggest problem is a lack of "knowledgeable" volunteers to work on TEI master documents. People who can do HTML and TXT layout are fairly easy to find. People who know TEI, not so much.
Wow,

Please help me understand the statement below:

"The biggest benefit we see from TEI ....."

My heart was set on the use of TEI as the core representation of text, which after such neutral processing, could then generate files as you mention, BUT then the use of such files would extend to many application areas for extensive analysis like:

Library: multi-dimensional analysis of text
Legal: Multi-Technique means of searching for precedence based cases
SupplyChain: DataMining techniques for cost reduction
etc...

pjr

Paul Jefferson Richards
TeamCenter Engineering / Distributed Integrations
"If Necessity is the Mother of Invention, Then
Responsibility is the Father of Accomplishment"
Cell: 248 343-4547

From: TEI (Text Encoding Initiative) public discussion list on behalf of Joshua Hutchinson
Sent: Fri 5/5/2006 12:35 PM
Subject: Re: Zefania bibles

Apologies if this arrives twice... I think the first attempt got eaten, but I'm not sure...
On 5/5/06, Lou Burnard wrote:
That's an excellent question! Outside mainstream academia and the
digital library people, I suppose I would point to the work of the
Gutenberg "distributed proof readers" guys -- who appear to have
developed their own customization of TEI Lite, and some documentation
for it, and who definitely fit the profile you describe. See for example
Here is the most up-to-date converter and docs on the PG TEI efforts:
The biggest benefit we see from TEI (which is admittedly not very widely used yet) is a single "master" file from which we can "autogenerate" the PDF, HTML and TXT file formats we need for presentation to the general public.
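
To make the "single master, many outputs" idea concrete, here is a minimal sketch in Python. It is not the PG converter mentioned above. The tiny TEI-like sample, the absence of an XML namespace, and the function names load_blocks, to_txt and to_html are all assumptions made up for illustration. A real TEI master has a header, many more elements, and needs a proper stylesheet or converter.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, heavily simplified TEI-like master document (assumption:
# no namespace, only <head> and <p> inside <body>).
MASTER = """<TEI>
  <text>
    <body>
      <head>Chapter I</head>
      <p>It was a dark and stormy night.</p>
      <p>Fish &amp; chips were served.</p>
    </body>
  </text>
</TEI>"""


def load_blocks(xml_source):
    """Return (tag, text) pairs for each <head>/<p> in document order."""
    root = ET.fromstring(xml_source)
    body = root.find("./text/body")
    blocks = []
    for el in body:
        if el.tag in ("head", "p"):
            # Collapse internal whitespace so layout in the master is irrelevant.
            text = " ".join("".join(el.itertext()).split())
            blocks.append((el.tag, text))
    return blocks


def to_txt(blocks):
    """Plain text: headings in capitals, blank line between blocks."""
    parts = [text.upper() if tag == "head" else text for tag, text in blocks]
    return "\n\n".join(parts) + "\n"


def to_html(blocks):
    """Bare-bones HTML: <head> becomes <h1>, <p> stays <p>."""
    lines = ["<html>", "<body>"]
    for tag, text in blocks:
        name = "h1" if tag == "head" else "p"
        lines.append(f"<{name}>{escape(text)}</{name}>")
    lines += ["</body>", "</html>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    blocks = load_blocks(MASTER)
    print(to_txt(blocks))
    print(to_html(blocks))
```

The point of the sketch is only that a typo gets fixed once, in MASTER, and both outputs are regenerated from it, rather than being corrected by hand in each file.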