Thanks for posting this example. I am a bit puzzled by the fact that it seems to contain two bibliographic descriptions of the same thing (the 1910 edition of the work), one using bibl, and one using biblFull. Is that common practice in the DTA? It seems rather misleading -- if you had a text for which there were two different sources, would there be *four* descriptions?

The place to indicate the source for a text is clearly the sourceDesc. The idea behind my encoding proposal is to show that the immediate source for my text is the Gutenberg text. So I use a bibl with the title that PG gives its text; other information, such as the date it was first published in PG, could probably be added. But I also want to show that the PG text itself has a source, and that is the "related item". I might have taken information from that related item (page breaks, for example) and added it to the PG text. So now I have two sources. My question remains: what's the best way of indicating clearly the relationship between those two sources?

To complicate matters, it's possible that I don't know the source for the PG text, of course, though I do know that it appears to be following the edition I've chosen as my copytext.
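
A minimal sketch of the arrangement described above, offered as an illustration and not as a definitive encoding: the outer bibl describes the Project Gutenberg text as the immediate source, and a nested relatedItem points to the print edition it appears to follow. The type value "original" and the wording of the note are assumptions made for this example, and the details of the print edition are left as a placeholder comment.

```xml
<sourceDesc>
  <!-- immediate source: the Project Gutenberg text, under the title PG gives it -->
  <bibl>
    <title>The Project Gutenberg EBook of Tono Bungay, by H. G. Wells</title>
    <!-- source of the PG text; the type value is an illustrative assumption -->
    <relatedItem type="original">
      <bibl>
        <title>Tono Bungay</title>
        <author>H. G. Wells</author>
        <!-- publisher, place and date of the copytext edition would go here -->
        <note>The PG text appears to follow this edition; PG does not state its source.</note>
      </bibl>
    </relatedItem>
  </bibl>
</sourceDesc>
```

Nesting the relatedItem inside the bibl for the PG text keeps the two descriptions in one place and makes the direction of derivation explicit, so a text with two independent sources would simply have two sibling bibl elements.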

On 22/09/18 19:33, Frederike Neuber wrote:
Dear Lou,
the information about the original source text is provided in
<sourceDesc>, as in the following example:

    <bibl type="M">Löwenfeld, Leopold: Student und Alkohol. München,
            <title level="m" type="main">Student und Alkohol</title>
                <persName ref="">
            <edition n="1"/>
                <name>M. Riegersche Universitäts-Buchhandlung</name>
            <date type="publication">1910</date>

So basically it is not mentioned here (as it is in your example) that this is
the Project Gutenberg version of "Student und Alkohol", which is also accurate,
because after conversion into the DTA base format it is no longer the Gutenberg
version. The information that Gutenberg is the source from which
the text derives is, however, already captured within the <respStmt>s.
So the difference between your encoding and the DTA encoding is that you
regard the Gutenberg ebook version as the source text (while the original
source is just a <relatedItem>), while the DTA regards the "original" book
as the source text. To me the latter seems at first sight more reasonable, given
my understanding of <sourceText>, but you might have an explanation for
your choice of encoding <title>The Project Gutenberg EBook of Tono Bungay,
by H. G. Wells</title> instead of <title>Tono Bungay</title> (+ the other


On Sat, 22 Sep 2018 at 17:20, Lou Burnard wrote:

Dear Frederike

I am glad we are in agreement about the use of respStmt here. But how does
DTA record the source text (e.g. printed book) from which a source text
(e.g. gutenberg text) was derived?


Lou (apologetically unable to cope with German)

On 21/09/18 18:13, Frederike Neuber wrote:

 Dear Lou,

in the German Text Archive (Deutsches Textarchiv, DTA), text and image
resources from the WWW are often integrated, also from  The base format of the
DTA foresees the provision of <respStmt>, as you suggested too.
Compared to your suggestion, the DTA provides several <respStmt>s for tasks
that it defines as separate: "provision of transcription", "provision of
images", "curation/conversion of data". While the first two mostly
refer to sources on the WWW, the last refers to the specific persons
who had the task of actually integrating these existing sources into the new
context. Another difference from your suggestion is that the DTA format
provides a date to indicate when the transcription or image was
integrated, which I think is very clever, since content on the internet can
change easily.

Here is a code snippet with a few comments, that might be clearer than any
of my explanations:

    <!-- respStmt for the transcription, taken from -->
            <note type="remarkResponsibility">Bereitstellung der
Texttranskription und Auszeichnung
                in der Syntax von</note>
            <!-- note that during integration the transcription might have
changed -->
            <note type="remarkRevisionDTA">Bitte beachten Sie, dass die
aktuelle Transkription (und
                Textauszeichnung) mittlerweile nicht mehr dem Stand zum
Zeitpunkt der Übernahme aus
       entsprechen muss.</note>
            <!-- source -->
            <ref target=""/>
            <!-- import date -->
            <date type="importDTA">2013-03-18T13:54:31Z</date>
        <!-- respStmt for the images -->
            <note type="remarkResponsibility">Bereitstellung der
            <ref target=""/>
            <date type="importDTA">2013-03-18T13:54:31Z</date>
        <!-- respStmt for the person who converted the gutenberg
source-text to the DTA-Base format -->
            <note type="remarkResponsibility">Konvertierung nach XML/TEI
            <date type="importDTA">2013-03-18T13:54:31Z</date>
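
For readers without the full header to hand, the fragments above might sit inside complete respStmt elements roughly as follows. This is a sketch under stated assumptions, not the DTA's actual header: placing note, ref and date inside resp is an assumption about the DTA base format and should be checked against its documentation; the URL and the person's name are hypothetical placeholders; and the notes reuse the English task labels given earlier in this message in place of the German wording.

```xml
<!-- who supplied the transcription -->
<respStmt>
  <orgName>Project Gutenberg</orgName>
  <resp>
    <note type="remarkResponsibility">provision of transcription</note>
    <!-- hypothetical URL standing in for the real source address -->
    <ref target="http://example.org/source-text"/>
    <!-- when the transcription was imported -->
    <date type="importDTA">2013-03-18T13:54:31Z</date>
  </resp>
</respStmt>

<!-- who converted the imported data to the DTA base format -->
<respStmt>
  <!-- hypothetical placeholder name -->
  <persName>Forename Surname</persName>
  <resp>
    <note type="remarkResponsibility">curation/conversion of data</note>
    <date type="importDTA">2013-03-18T13:54:31Z</date>
  </resp>
</respStmt>
```

Keeping one respStmt per task means each responsibility carries its own import date, which matters when the online source can change after it was imported.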

In case you speak German, here is also some documentation.
Hope that helps,


On Fri, 21 Sep 2018 at 17:47, Gioele Barabucci wrote:

On 21/09/2018 16:35, Lou NoMiddleName Burnard wrote:

I have been thinking about how to represent economically and clearly the
bibliographic status of a digital text which is derived from another
one. For example, consider a Project Gutenberg text which we believe to
be a version of some specific print edition.


Obviously one could add a whole lot more; I am trying to show just the
bare essentials here. What do you think?

Dear Lou,

what about providing a TEI-mapping of PROV-O, the W3C provenance ontology?

PROV-O, for all its shortcomings, already takes into account FRBR,
something your example was pointing to (deliberately or accidentally).
The example for the relation prov:hadPrimarySource is a translated and
formatted text from the Gutenberg project. :)
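
To make the suggestion a little more concrete, the underlying PROV-O statements could be sketched in RDF/XML as below. This is not a TEI mapping, only an illustration of the two relations involved: prov:wasDerivedFrom and prov:hadPrimarySource are PROV-O properties, while the three example.org URIs are hypothetical identifiers for the new TEI text, the Gutenberg text and the print edition.

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:prov="http://www.w3.org/ns/prov#">

  <!-- the new TEI text is derived from the Gutenberg text -->
  <prov:Entity rdf:about="http://example.org/texts/tei-edition">
    <prov:wasDerivedFrom rdf:resource="http://example.org/texts/gutenberg-etext"/>
  </prov:Entity>

  <!-- the Gutenberg text has the print edition as its primary source -->
  <prov:Entity rdf:about="http://example.org/texts/gutenberg-etext">
    <prov:hadPrimarySource rdf:resource="http://example.org/texts/print-edition"/>
  </prov:Entity>

</rdf:RDF>
```

A TEI mapping would then need to decide which header elements carry each of these two relations.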


Gioele Barabucci