I expect to have a tokenized and linguistically annotated version of Phase I and Phase II by mid-July. I assume Notre Dame has the rights to both. If so, you'll be welcome to both. Have you worked with Matt Wilkens?

From: Eric Lease Morgan <[log in to unmask]>
Reply-To: Eric Lease Morgan <[log in to unmask]>
Date: Friday, June 5, 2015 at 2:54 PM
To: "[log in to unmask]" <[log in to unmask]>
Subject: eebo

Can somebody please point me in the direction of acquiring the EEBO files encoded in TEI?

I’m looking to provide interesting text mining services against the content found in EEBO, but I’m having a difficult time reverse-engineering the data I’ve been given. After a bit of investigation, I think the data I have is out of date: I got it more than eighteen months ago, and the files, while in both SGML and XML, are not in TEI. Instead, I have a jumble of lists with identifiers, header files with really basic metadata, and two sets of encoded text (one in SGML and the other in XML).

Since my acquisition of the files, I believe the EEBO Phase I files have been released as real and true TEI. Do you know anything about this? If so, do you know where I can get such files? Yes, my institution is a member of the TCP.

Eric Lease Morgan
Digital Initiatives Librarian

University of Notre Dame
Room 131, Hesburgh Libraries
Notre Dame, IN 46556
o: 574-631-8604