A couple of years ago I gathered some URLs into a project for testing.

https://github.com/stuartyeates/sampler/tree/master/TEI

Cheers
Stuart

On Wednesday, December 21, 2016, MLH <[log in to unmask]> wrote:

> Hi Matthew,
>
>
> Greta Franzini's digital scholarly editions app links to 16 resources with
> downloadable TEI, but I'm afraid it doesn't specify regarding bulk download
> / predictable URLs. Still, it may be worth a look.
>
> https://dig-ed-cat.acdh.oeaw.ac.at/browsing/editions/?name=&institution__name=&manager__name=&url=&scholarly=&digital=&edition=&writing_support=&begin_date=&end_date=&audience=&philological_statement=&textual_variance=&value_witnesses=&tei_transcription=&download=1&images=&zoom_images=&image_manipulation=&text_image=&source_translation=&glossary=&indices=&search=&advanced_search=&cc_license=&open_source=&infrastructure=&key_or_ocr=&print_friendly=&api=&amount=&Filter=Filter
>
>
> Matthew
>
> ------------------------------
> *From:* TEI (Text Encoding Initiative) public discussion list
> <[log in to unmask]> on behalf of
> TEI-L automatic digest system <[log in to unmask]>
> *Sent:* 20 December 2016 05:00
> *To:* [log in to unmask]
> *Subject:* TEI-L Digest - 18 Dec 2016 to 19 Dec 2016 (#2016-244)
>
> There are 4 messages totaling 541 lines in this issue.
>
> Topics of the day:
>
>   1. seeking links to TEI corpora (3)
>   2. Don't upgrade your Oxygen plugin yet!
>
> ----------------------------------------------------------------------
>
> Date:    Mon, 19 Dec 2016 17:13:29 +0000
> From:    "Lavin, Matthew J" <[log in to unmask]>
> Subject: seeking links to TEI corpora
>
> Apologies for any duplicates received due to cross-posting.
>
> I am collecting links for publicly accessible, computable TEI (or other
> similar xml markup such as SGM, LMNL) files. In order to be included,
> archives/collections/datasets/corpora must meet one of the two
> criteria:
>
> Bulk download of raw xml (not html transformed)
> Xml fully accessible via predictable url structure (an example of this
> would be the Walt Whitman archive, which has a “raw xml” link on every
> transformed html page)
>
> Please note that I am not interested in sample xml, only collections with
> some kind of curatorial or scholarly focus. Thank you all for any leads!
>
> Matthew Lavin
> Clinical Assistant Professor of English and Director of Digital Media Lab
> University of Pittsburgh
>
> ------------------------------
>
> Date:    Mon, 19 Dec 2016 09:33:09 -0800
> From:    Martin Holmes <[log in to unmask]>
> Subject: Don't upgrade your Oxygen plugin yet!
>
> Hi all,
>
> We've found a problem with the latest release of the TEI Oxygen plugin,
> so if you have it installed (instead of the regular TEI framework
> bundled with Oxygen), please don't update it to the new release. We're
> working on the problem.
>
> Cheers,
> Martin
>
> ------------------------------
>
> Date:    Mon, 19 Dec 2016 18:05:19 +0000
> From:    "Dalmau, Michelle Denise" <[log in to unmask]>
> Subject: Re: seeking links to TEI corpora
>
> Dear Matthew,
>
> The IU Libraries provide XML downloads (at the item-level) for the
> following TEI P5 collections:
>
> Wright American Fiction: http://dlib.indiana.edu/collections/wright/
> Wright American Fiction - Home
> <http://dlib.indiana.edu/collections/wright/>
> dlib.indiana.edu
> Lyle H. Wright, a librarian at the Huntington Library in San Marino, CA,
> created a bibliography of American fiction from the years 1851–1875,
> published as American ...
>
> Victorian Women Writers Project: http://www.dlib.indiana.edu/collections/vwwp/
> Victorian Women Writers Project- Home - The VWWP
> <http://www.dlib.indiana.edu/collections/vwwp/>
> www.dlib.indiana.edu
> The Victorian Women Writers Project (VWWP) began in 1995 at Indiana
> University and is primarily concerned with the exposure of lesser-known
> British women writers of ...
>
> Brevier Legislative Reports: http://www.dlib.indiana.edu/collections/law/brevier/
>
> We have two additional projects in TEI P4 with XML download:
> Indiana Authors and Their Books: http://dlib.indiana.edu/collections/inauthors
> Indiana Authors and Their Books - Home
> <http://dlib.indiana.edu/collections/inauthors>
> dlib.indiana.edu
> Indiana Authors and Their Books is an LSTA–funded project based on the
> digitization and encoding of the 3–volume reference work, Indiana Authors
> and Their Books ...
>
> Indiana Magazine of History: https://scholarworks.iu.edu/journals/index.php/imh
> (XML download in the View Text link per article)
> Indiana Magazine of History - Indiana University
> <https://scholarworks.iu.edu/journals/index.php/imh>
> scholarworks.iu.edu
> Published continuously since 1905, the Indiana Magazine of History is one
> of the nation's oldest historical journals. Since 1913, the IMH has been
> edited and ...
>
>
> You could also grab most of these files via GitHub:
> https://github.com/iulibdcs/tei_text  (caveat: the repo needs to be
> refreshed — on our to-do list)
> <https://github.com/iulibdcs/tei_text>
> GitHub - iulibdcs/tei_text: Free-for-all repository of TEI ...
> <https://github.com/iulibdcs/tei_text>
> github.com
> tei_text - Free-for-all repository of TEI and plain text files for you (to
> do cool stuff) provided by the Digital Collections Services group at the
> Indiana University ...
>
>
> This is probably not what you are after, but we provide EAD XML access to
> IU finding aids as well:
> http://dlib.indiana.edu/collections/findingaids/
> Archives Online at Indiana University
> <http://dlib.indiana.edu/collections/findingaids/>
> dlib.indiana.edu
> Welcome to Archives Online at Indiana University. This site is a portal
> for accessing descriptions of Special Collections and Archives - ones
> chiefly containing ...
>
>
> —Michelle
> -----
> Michelle Dalmau
> Head, Digital Collections Services
> -----
> Indiana University Libraries
> Herman B Wells Library
> 1320 East 10th Street, Rm W501
> Bloomington, Indiana 47405
> -----
> Web:  http://michelledalmau.com
> Twitter:  @mdalmau
>
>
> On Dec 19, 2016, at 12:13 PM, Lavin, Matthew J <[log in to unmask]> wrote:
>
> Apologies for any duplicates received due to cross-posting.
>
> I am collecting links for publicly accessible, computable TEI (or other
> similar xml markup such as SGM, LMNL) files. In order to be included,
> archives/collections/datasets/corpora must meet one of the two
> criteria:
>
> Bulk download of raw xml (not html transformed)
> Xml fully accessible via predictable url structure (an example of this
> would be the Walt Whitman archive, which has a “raw xml” link on every
> transformed html page)
>
> Please note that I am not interested in sample xml, only collections with
> some kind of curatorial or scholarly focus. Thank you all for any leads!
>
> Matthew Lavin
> Clinical Assistant Professor of English and Director of Digital Media Lab
> University of Pittsburgh
>
> ------------------------------
>
> Date:    Mon, 19 Dec 2016 10:08:46 -0800
> From:    Matthew Davis <[log in to unmask]>
> Subject: Re: seeking links to TEI corpora
>
> Dear Matthew,
>
> I don’t know that it’s what you’re looking for (it is still early days,
> there’s still a lot to transcribe and input, and I’m one person doing all
> the work), but I think my archive of Lydgate works may meet your criteria.
> There’s a link to download the xml for each transformed html page, and the
> raw xml files are stored under an XML folder by work.
>
> The link is www.minorworksoflydgate.net <http://www.minorworksoflydgate.net/>.
> Much of it is still behind a password as I’m hoping to have a peer review
> done on it, but the items in the Clopton chantry chapel
> (http://www.minorworksoflydgate.net/Quis_Dabit/Clopton/ww_qd_1.html and
> http://www.minorworksoflydgate.net/Testament/Clopton/sw_test_1.html)
> are readily accessible since the transcriptions will be published in
> January.  If it’s what you’re looking for, send me a message off-list and
> I’ll give you the password credentials for the other items.
> The Minor Works of John Lydgate <http://www.minorworksoflydgate.net/>
> www.minorworksoflydgate.net
> Welcome to the virtual archive of the minor works of the fifteenth-century
> poet, John Lydgate. The goals of this archive are twofold: first, it is an
> ...
>
>
> There’s also a section on the site, “About the Archive,” that articulates
> some of my thinking about site design, the decisions I made while encoding,
> etc.
>
> All the best,
> —Matt
>
>
> > On Dec 19, 2016, at 9:13 AM, Lavin, Matthew J <[log in to unmask]> wrote:
> >
> > Apologies for any duplicates received due to cross-posting.
> >
> > I am collecting links for publicly accessible, computable TEI (or other
> > similar xml markup such as SGM, LMNL) files. In order to be included,
> > archives/collections/datasets/corpora must meet one of the two
> > criteria:
> >
> > Bulk download of raw xml (not html transformed)
> > Xml fully accessible via predictable url structure (an example of this
> > would be the Walt Whitman archive, which has a “raw xml” link on every
> > transformed html page)
> >
> > Please note that I am not interested in sample xml, only collections
> > with some kind of curatorial or scholarly focus. Thank you all for any
> > leads!
> >
> > Matthew Lavin
> > Clinical Assistant Professor of English and Director of Digital Media Lab
> > University of Pittsburgh
> >
>
> ------------------------------
>
> End of TEI-L Digest - 18 Dec 2016 to 19 Dec 2016 (#2016-244)
> ************************************************************
>


-- 
...let us be heard from red core to black sky