Hi all,

A while back, I asked about the creation of a fairly large and
wide-ranging text archive at TEI for developers to use.  It was
discussed back then and there were some concerns raised.  I also
noticed that the issue was raised again at the Presentation
SIG at Sofia:
  http://www.tei-c.org/Activities/SIG/Presentation/pt02.xml
We did indeed assemble a test collection of materials in 
XML/SGML from a number of kind contributors, including (but
not limited to) DocSouth, Brown WWP, the Sanger archive, and
the Nameless Shakespeare (thanks guys) as well as resources like 
EEBO-TCP to which we have access and local materials.   I was very 
happy to have this material to serve as a development target.  

I agree completely with Lou and Grace and would again propose to
the TEI that it consider creation of a fairly large test sample
of materials -- old and new, lite and heavy, simple and complex,
SGML and XML -- to serve as a test suite for developers.  Having 
worked with TEI-encoded materials from many projects, I can attest 
to the considerable variability of TEI-conformant documents, between 
projects and even between documents from individual projects.  
This is particularly true of header constructs and objects
like notes, pointers, etc.  I suspect that the collection created
here, drawn from the projects I knew of, contacted, and got a
favorable reply from, is far from a representative sample -- almost
all of it is American.  I think the TEI and users of TEI-encoded
resources would benefit greatly from as wide a collection as can
reasonably be assembled.  

Finally, the TEI might also include TEI variants such as CES and 
MEP, which I also put in my test suite.  

A bientot!

M


>> Date:         Wed, 1 Mar 2006 22:26:12 +0000
>> From: Lou Burnard <[log in to unmask]>
>> Subject: Re: Typical TEI files
>> Comments: To: [log in to unmask]
>> To: [log in to unmask]
>> 
>> Thank you Grace! That's exactly what I was hoping someone would say -- 
>> and, by the way, if you (or someone else) wants to work on enhancing the 
>> metadata on that freely available website, or on persuading more of the 
>> projects listed there to make some sample TEI data freely available from 
>> the site, I know that the TEI Board would love to hear from you! Maybe 
>> the SIG on TEI in Libraries would be interested too.
>> 
>> 
>> 
>>   Grace Wiersma wrote:
>> 
>> > I'd like to add a caveat and question to this thread.
>> > 
>> > For the benefit of the wider language resources and digital libraries
>> > movements, I would plead that the TEI community avoid tailoring lists of
>> > examples/illustrations as private communications to any particular digital
>> > library vendor. Helping one powerful vendor to bring a TEI-conformant and
>> > Unicode-aware product to market could create difficulties for researchers
>> > who, for lack of resources or because of their affiliation, do not have
>> > access to that particular DL product.
>> > 
>> > There are already pointers/references to 126 "Projects Using the TEI"
>> > (freely available at http://www.tei-c.org/Applications). I wonder why the
>> > existing metadata could not be captured and exploited to provide a working
>> > union catalog of TEI-encoded texts, illustrating a diverse range of document
>> > types, and as much of the full Unicode standard as is currently known to be
>> > implemented, and then made searchable from the TEI Consortium site, to
>> > provide an open information resource for software developers, digital
>> > library projects, & diverse independent researchers.
>> > 
>> > Isn't someone interested in doing this?
>> > 
>> > Grace Wiersma
>> > Cataloging & Metadata Services
>> > MIT Libraries
>> > 
>> > -----Original Message-----
>> > From: TEI (Text Encoding Initiative) public discussion list
>> > [mailto:[log in to unmask]] On Behalf Of John A. Walsh
>> > Sent: Wednesday, March 01, 2006 2:54 PM
>> > To: [log in to unmask]
>> > Subject: Re: Typical TEI files
>> > 
>> > Hi all,
>> > 
>> > I'd like to advise the TEI community to respond broadly to this  
>> > request from DiMeMa, Inc.  CONTENTdm is widely used in the library  
>> > communities, and it would be a great help to the TEI community to  
>> > have good support in CONTENTdm for TEI.  Ideally, we can supply them  
>> > with plenty and diverse documents, so they are aware that the TEI  
>> > world, even the library TEI world, is not restricted to TEI Lite.  It  
>> > would also be useful to send them examples representing a variety of  
>> > languages/scripts with lots of non-Latin Unicode ranges represented.   
>> > I'll be sending examples from our Swinburne Archive and other  
>> > projects and encourage others to participate in this effort.
>> > 
>> > John
>> > --
>> > | John A. Walsh
>> > | Associate Director for Projects and Services, Digital Library Program
>> > | Associate Librarian, University Libraries
>> > | Adjunct Associate Professor, Department of English
>> > | Indiana University, 1320 East Tenth Street, Bloomington, IN 47405
>> > | Voice:812-855-8758 Fax:812-856-2062 <mailto:[log in to unmask]>
>> > 
>> > 
>> > 
>> > On Mar 1, 2006, at 12:50 PM, Lou's Laptop wrote:
>> > 
>> > 
>> >>The TEI Applications page http://www.tei-c.org/Applications/ is  
>> >>probably the best place to start looking for such materials.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>Peter MacDonald wrote:
>> >>
>> >>
>> >>>DiMeMa, Inc., the developers of CONTENTdm, which is a popular  
>> >>>digital collection management system, has shown an interest in  
>> >>>supporting documents encoded in TEI. They asked me to send them  
>> >>>some examples of TEI-encoded documents for their developers to  
>> >>>examine.
>> >>>
>> >>>The only TEI-encoded documents I have to send them from my own  
>> >>>collection are some manuscript letters and some journal articles.  
>> >>>I am not very experienced in TEI encoding and hesitate to send  
>> >>>them my own files.
>> >>>
>> >>>I'd really like to send them a link to a more "official" set of  
>> >>>TEI-encoded pages that can demonstrate to DiMeMa the wide range of  
>> >>>document types that can be encoded in TEI and that have been  
>> >>>encoded by experienced users of the TEI.
>> >>>
>> >>>Can anyone point me to such a page or Web site?
>> >>>
>> >>>If not, I guess I could just send them links to guidelines such as  
>> >>>the following:
>> >>>
>> >>>http://etext.lib.virginia.edu/standards/tei/uvatei.html
>> >>>
>> >>>or, point them to pages from the TEI P4 guidelines that discuss  
>> >>>various kinds of documents:
>> >>>http://www.tei-c.org/P4X/DS.html (default structure)
>> >>>http://www.tei-c.org/P4X/PR.html (prose)
>> >>>http://www.tei-c.org/P4X/VE.html (verse)
>> >>>http://www.tei-c.org/P4X/DR.html (drama)
>> >>>http://www.tei-c.org/P4X/TS.html (speech)
>> >>>http://www.tei-c.org/P4X/DI.html (dictionaries)
>> >>>http://www.tei-c.org/P4X/TE.html (terminological databases)
>> >>>
>> >>>FYI: DiMema's Home page: http://www.dimema.com/index.html
>> >>>
>> >>>Thank you,
>> >>>Peter
>> >>>
>> >>>Peter MacDonald
>> >>>Library Information Systems Specialist
>> >>>Hamilton College Library
>> >>>315 859-4493
>> >>>315 859-4578 (fax)
>> >>>
>> >>>
>> > 
>> > 
>>