Yes, those are the snapshots I had in mind. The later one replaces the
earlier one. I.e. each one contains everything, albeit at two different
I've also just now dropped a wget script for those who 'just want the
files' and don't want to bother with creating a local git repository
for them; in the same folder, formatted and named as a Windows batch
file for those who think that way (and who have a Windows version of
wget on their Windows path); and named
-- probably identical to the script I used to create those snapshots to
On Fri, Jun 5, 2015, at 16:40, Eric Lease Morgan wrote:
> On Jun 5, 2015, at 3:54 PM, Eric Lease Morgan wrote:
> > Can somebody please point me in the direction of acquiring the EEBO files encoded in TEI?
> Hmmm… I think I found them on Box:
> * P5_snapshot_201501 - http://bit.ly/1FCsAJc
> * P5_snapshot_201502 - http://bit.ly/1QcvxLP
> As long as I can get everything in one go, and as long as the files are
> in some flavor of XML (preferably TEI), then I can go to the next step.
> For example, I can read over each file extracting its bibliographic
> information to create a “catalog” complete with a pointer to the local
> file system where the raw XML resides. This provides the means for a
> searchable/browsable interface. I can then loop over each TEI file
> extracting the transcribed text to do text mining services — counting
> words. On my mark. Get set. Go.
> Again, thank you for the prompt replies. They were very helpful.
> Eric Morgan
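
For what it's worth, the two passes Eric describes (a catalog of bibliographic data pointing at the local files, then a loop that pulls out the transcribed text and counts words) might be sketched roughly as below. This is only a sketch under assumptions, not a description of anyone's actual tooling: it assumes the files are TEI P5 in the standard TEI namespace, that title and author sit in teiHeader/fileDesc/titleStmt, and that the snapshot has been unpacked into a local folder. The function names and the folder name "P5_snapshot" are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Assumption: files use the standard TEI P5 namespace.
TEI = {"tei": "http://www.tei-c.org/ns/1.0"}
# Assumption: title and author live in the header's titleStmt.
TITLE_STMT = "tei:teiHeader/tei:fileDesc/tei:titleStmt/"


def catalog_entry(path):
    """Return a small bibliographic record plus a pointer to the local file."""
    root = ET.parse(path).getroot()
    return {
        "title": root.findtext(TITLE_STMT + "tei:title", default="", namespaces=TEI).strip(),
        "author": root.findtext(TITLE_STMT + "tei:author", default="", namespaces=TEI).strip(),
        "path": str(path),
    }


def build_catalog(folder):
    """Yield one catalog record per .xml file found under the folder."""
    for path in sorted(Path(folder).rglob("*.xml")):
        yield catalog_entry(path)


def word_counts(path):
    """Count lowercased words in the transcribed text (the <text> element)."""
    root = ET.parse(path).getroot()
    text_element = root.find("tei:text", TEI)
    if text_element is None:
        return Counter()
    # Join without separators so words split by inline markup stay whole.
    text = "".join(text_element.itertext())
    # Runs of letters only; digits, punctuation, and underscores are dropped.
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


if __name__ == "__main__":
    # Hypothetical local folder holding the unpacked snapshot.
    for record in build_catalog("P5_snapshot"):
        top_words = word_counts(record["path"]).most_common(5)
        print(record["title"], record["author"], top_words, sep=" | ")
```

The catalog records could just as well be written to a CSV file or a database to back a searchable or browsable interface. The word pattern here is deliberately crude and would likely need tuning for early modern spelling and abbreviation marks.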
Paul Schaffner Digital Library Production Service
http://www.umich.edu/~pfs/