Yes, I agree that the SIG on Libraries might be a good group through which to organize work on this.  In recent years the SIG has only worked on revising Best Practices for TEI in Libraries – a project that Elli, Syd, and I have been working to complete over the past few months.

However, as I wrote to the SIG list back in November, Stefanie and I would like to step back as convenors of this SIG.  No one has volunteered to take over, but I wonder if there are people interested in the topic below who might form a new cohort of SIG members and from whom a new SIG convenor (or co-convenors) might emerge.


On 5/13/18 11:17 PM, Martin Mueller wrote:

This is indeed a very big problem and raises all manner of bibliographical and ethical problems. We have pushed that can down the road in our EarlyProject project, which takes as its source a TCP transcription but adds different kinds of values through linguistic annotation and some other steps. I lack the bibliographical or technical expertise to come up with a good solution, but I know it’s a can that we cannot push down the road much longer. It would be good to have some discussion of this in the Guidelines. There may not be a single solution, but there certainly should be some guidance about what is practical and proper.


Would this be a good topic for the TEI in Libraries group to pick up?  That certainly is a group of people who know a lot about the intersection of bibliography and TEI, which seems to be the sweet or sore spot of this problem.


As more and more digital texts get produced and then repurposed (curated?) into TEI, I am sure more and more people have had to face the problem of how properly to represent their pedigree (the text's, not the people's) in the sourceDesc. But I haven't yet found any very clear indication of recommended practice in this respect, or not one I like much at any rate.

Here's a far from unusual scenario. The project is producing a collection of literary texts in TEI, many of which are already digitized as page images, or HTML, or some other non-TEI format. They may even be in TEI, but it's not the same as the TEI we want in our project. The project has defined a rather strict and specific TEI schema, and everything has to be converted to it. Consequently, it needs to record in the sourceDesc up to three bibliographic descriptions -- one for the digital source used, one for the print source from which that digital version derives, and possibly one or more others for sources used to modify the primary digital source.  I don't think it's good enough just to list the three bibls (if bibls they be) because that loses information about the relationships amongst them. So here is an example of how I am thinking of doing this:

<ref target=""> Tatiana Leïlof roman parisien (édition numerisée) </ref>
<publisher> / Bibliothèque nationale de France </publisher>
<idno type="ARK">12148/bpt6k931128v</idno>
<relatedItem type="printSource">
  <bibl><title>Tatiana Leïlof , roman parisien, par Édouard Rod</title>
  <publisher>E. Plon, Nourrit et Cie</publisher>

I think this shows rather nicely that the source of the text in the header of which this appears is a digital text published by the BNF with the identifier shown, the print source for which is the title published in Paris in 1886. Now suppose that the digital source used for the project has been collated with a (fictitious) 20th c edition to create our new TEI version. I can just add another relatedItem within the outer bibl, distinguishing it by means of its @type attribute:

<relatedItem type="collatedWith">
  <bibl><title>Tatiana Leïlof , roman parisien, par Édouard Rod</title>
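
Put together in one place, the nesting described above might look roughly like the sketch below. This is only an illustration under stated assumptions: the type="digitalSource" value on the outer bibl is a made-up label, the pubPlace and date elements are inferred from the "Paris in 1886" remark rather than copied from the fragments, the ref target is left empty exactly as in the fragment above (it would need a real URI to validate), and the collated edition carries no details because it is fictitious.

```xml
<sourceDesc>
  <!-- Outer bibl: the digital source actually used by the project.
       The type value here is an assumed label, not an established one. -->
  <bibl type="digitalSource">
    <!-- target is left empty, as in the fragment above -->
    <ref target="">Tatiana Leïlof roman parisien (édition numerisée)</ref>
    <publisher>Bibliothèque nationale de France</publisher>
    <idno type="ARK">12148/bpt6k931128v</idno>

    <!-- Print edition from which the digital version derives.
         pubPlace and date are inferred from the surrounding prose. -->
    <relatedItem type="printSource">
      <bibl>
        <title>Tatiana Leïlof, roman parisien, par Édouard Rod</title>
        <pubPlace>Paris</pubPlace>
        <publisher>E. Plon, Nourrit et Cie</publisher>
        <date>1886</date>
      </bibl>
    </relatedItem>

    <!-- Edition collated against the digital source.
         Publisher and date of the fictitious 20th c. edition would go here. -->
    <relatedItem type="collatedWith">
      <bibl>
        <title>Tatiana Leïlof, roman parisien, par Édouard Rod</title>
      </bibl>
    </relatedItem>
  </bibl>
</sourceDesc>
```

The point of the nesting is that both relatedItem elements sit inside the outer bibl, so each related description is explicitly tied to the digital source rather than listed alongside it as an unrelated sibling.
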
My question to the list is: does this look reasonable? And if you were (or have been) faced with this scenario, how would you deal with it? I know, I know, you'd use RDF. But say you want to humour an old man, and do everything in TEI :-)