On Jun 5, 2015, at 3:54 PM, Eric Lease Morgan wrote:
> Can somebody please point me in the direction of acquiring the EEBO files encoded in TEI?
Hmmm… I think I found them on Box:
* P5_snapshot_201501 - http://bit.ly/1FCsAJc
* P5_snapshot_201502 - http://bit.ly/1QcvxLP
As long as I can get everything in one go, and as long as the files are in some flavor of XML (preferably TEI), then I can go to the next step. For example, I can read over each file, extracting its bibliographic information to create a “catalog” complete with a pointer to the place on the local file system where the raw XML resides. This provides the means for a searchable/browsable interface. I can then loop over each TEI file, extracting the transcribed text to do text mining services, such as counting words. On my mark. Get set. Go.
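To make the plan above a bit more concrete, here is a rough Python sketch of the two passes: one to build the catalog and one to count words. It assumes the snapshot files use the standard TEI P5 namespace and keep the title and author under titleStmt; I have not checked either against the actual files. The function names are only illustrative.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Assumption: the files use the standard TEI P5 namespace.
TEI = {"tei": "http://www.tei-c.org/ns/1.0"}


def catalog_entry(path):
    """Return one catalog record: title, author, and a pointer to the raw XML."""
    root = ET.parse(path).getroot()
    # Assumption: title and author live under teiHeader/fileDesc/titleStmt.
    title = root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=TEI)
    author = root.findtext(".//tei:titleStmt/tei:author", default="", namespaces=TEI)
    return {
        "title": " ".join(title.split()),    # collapse runs of whitespace
        "author": " ".join(author.split()),
        "path": str(path),                   # where the raw XML resides
    }


def build_catalog(directory):
    """Read over every .xml file below directory and collect catalog records."""
    return [catalog_entry(p) for p in sorted(Path(directory).rglob("*.xml"))]


def word_counts(path):
    """Count the words in the transcribed text (the <text> element) of one file."""
    root = ET.parse(path).getroot()
    text = root.find(".//tei:text", TEI)
    if text is None:
        return Counter()
    # Join without a separator so inline markup inside a word does not split it.
    content = "".join(text.itertext()).lower()
    # Runs of letters only; digits, punctuation, and underscores are dropped.
    return Counter(re.findall(r"[^\W\d_]+", content))
```

Calling build_catalog() on the directory where the snapshot was unpacked would give a list of records to feed an index, and word_counts(record["path"]) would give the tallies for any one of them.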
Again, thank you for the prompt replies. They were very helpful.