To be clear: we have had <facsimile>/<surface>/<zone> for a lot longer
than we've had <sourceDoc>. It's been possible for many years to
describe the physical structure of a document in surfaces and zones with
dimensions etc., and link the textual encoding to it. What's new in
<sourceDoc> is the ability to meld transcription with physical
description (transcribed text inside <zone> elements etc.)
The distinction between <facsimile> and <text> was a comfortingly clear
one; <sourceDoc> is clearly more problematic.
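
A minimal sketch of the difference, as two alternative fragments rather than one document. The identifiers, coordinates and image file name are all invented for illustration:

```xml
<!-- Long-standing approach: physical description lives in <facsimile>,
     transcription lives in <text>, and @facs links the two. -->
<facsimile>
  <surface xml:id="p1" ulx="0" uly="0" lrx="200" lry="300">
    <graphic url="page1.jpg"/>
    <zone xml:id="p1z1" ulx="20" uly="20" lrx="180" lry="60"/>
  </surface>
</facsimile>
<text>
  <body>
    <pb facs="#p1"/>
    <p facs="#p1z1">Some transcribed text</p>
  </body>
</text>

<!-- Embedded transcription: the transcribed text sits directly inside
     the <zone> that describes where it is written. -->
<sourceDoc>
  <surface ulx="0" uly="0" lrx="200" lry="300">
    <zone ulx="20" uly="20" lrx="180" lry="60">
      <line>Some transcribed text</line>
    </zone>
  </surface>
</sourceDoc>
```

In the first fragment the <p> still belongs to the textual hierarchy and only points at the zone; in the second the zone itself is the container, which is where the nesting questions start.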
Another issue that hasn't really been addressed is this:
If you start to allow structural markup such as <list>s inside
<sourceDoc> elements, you're going to have to decide, at every level,
what nests inside what. For instance, should an <item> appear inside a
<line>? What if it wraps to the next line? Is that two <line>s inside
one <item>, or two <item>s linked with @next/@prev inside discrete
<line>s? You would have to privilege one hierarchy, and presumably it
would be the genetic hierarchy (otherwise you'd be using <text>); the
other hierarchy would end up horribly fragmented, and its own natural
relationships (names containing name parts, addresses with addrLines,
etc.) would be increasingly messy.
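
To make the dilemma concrete, here is a hypothetical sketch of the two options for a single list item that wraps over two written lines. Neither is valid against the current content models (<list> and <item> are not permitted inside <sourceDoc>); the sample text and the xml:id values are invented:

```xml
<sourceDoc>
  <surface>
    <!-- Option A: the structural hierarchy wins.
         One <item> contains two <line>s. -->
    <zone>
      <list>
        <item>
          <line>buy three quires of paper and</line>
          <line>a bottle of ink</line>
        </item>
      </list>
    </zone>

    <!-- Option B: the documentary hierarchy wins.
         Each <line> holds a fragment of the <item>, and the fragments
         are stitched back together with @next/@prev. -->
    <zone>
      <line><item xml:id="i1a" next="#i1b">buy three quires of paper and</item></line>
      <line><item xml:id="i1b" prev="#i1a">a bottle of ink</item></line>
    </zone>
  </surface>
</sourceDoc>
```

Note that in Option B the enclosing <list> has no obvious home at all, which is exactly the kind of fragmentation I mean.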
On 14-02-05 07:30 AM, Peter Robinson wrote:
> We have, for ages now, related occurrences of <lb/> with xy coordinates,
> then processed them so we can say “this piece of text, occurring after
> <lb n="1"/> and before <lb n="2"/> occurs within these xy coordinates in
> this image of this page”. However, this won’t work for the case Lou
> cites, of associating text with a random space in the manuscript (though
> one might, with some ingenuity, use milestones to mark out the range of
> text, and then link the milestones to co-ordinates).
> I think Oliver, myself, and many others, are looking for the same thing:
> a means of encoding the text as document (pages, lines, etc) and the
> text as communicative act (with <div> <rs> and all that) in a single
> encoding. Before the existing Chapter 11, you could do this from one
> direction only: use <text> etc, with <pb/> etc. Now we have the
> surface/zone system, and people would like to do this from the other
> direction. The argument for working from the <text><pb/> direction is
> that it fits a very large number of circumstances, perfectly
> satisfactorily. I’d stand by my comment, that the surface/zone system
> was devised for genetic editions (as its history shows), and is highly
> suited to that context. That is not to say that surface/zone are
> “only” useful in encoding genetic editions, just that using them in
> (say) transcriptions of Canterbury Tales manuscripts would mean that
> significant aspects of the text as communicative act could not be
> encoded within that transcription.
> On 5 Feb 2014, at 09:02, Oliver Gasperlin wrote:
>> Just to the point.
>> That is why I consider the idea of simply using <text>, <pb/> and <lb/> a
>> bit counterproductive.
>> -----Original Message-----
>> From: TEI (Text Encoding Initiative) public discussion list
>> On behalf of Lou Burnard
>> Sent: Wednesday, 5 February 2014 14:28
>> Subject: Re: [TEI-L] Embedded transcription and text structure
>> I wouldn't dream of disagreeing with either Sebastian or Peter. But I
>> also think it's a bit of a misrepresentation to say that surface zone
>> etc. are only useful for the encoding of genetic editions.
>> They also provide the very generally useful ability to define a two
>> dimensional grid and to map specific parts of your transcription to it.
>> This facility was there of course long before the genetic workgroup
>> started using it as well.
>> And I can't think of any obvious or even non-obvious way of saying "this
>> bit of text transcribes a block of text in the top right hand corner" or
>> "this bit of text is written at right angles to the rest" without it.
>> Good luck doing that with <lb/>.
>> On 05/02/14 13:14, Peter Robinson wrote:
>>> Indeed, as Sebastian points out: this is really a case (yet again) of the
>>> endemic difficulty XML (and its parent, SGML, and any system based on the
>>> OHCO thesis) has with overlapping hierarchies.
>>> Agreeing twice with Sebastian in a single email: the long-known and
>>> well-established system of using <text> with <pb/>, <cb/> and <lb/> etc,
>>> remains available, and this will allow embedding of <list>, <person>,
>>> <note>, etc, in all the usual places. And, as James points out, we have
>>> decades of examples of ways of manipulating <pb/> and its friends to
>>> produce page-by-page, even line-by-line, views, as you wish. Our own
>>> Textual Communities project takes this approach, and builds mechanisms
>>> for viewing documents by both hierarchies (page/column/line;
>>> text/div/p or l etc) into its base.
>>> In my view, one of the problems here is that the wording and presentation
>>> of the current chapter 11 of the Guidelines implies that the system of
>>> <surface>, <zone> etc should be used for transcription of ALL primary
>>> materials. Thus its first sentence:
>>> "This chapter defines a module intended for use in the representation of
>>> primary sources, such as manuscripts or other written materials."
>>> In fact, this system is ideally suited for one rather narrow editorial
>>> circumstance: the transcription of "genetic editions", typically (almost
>>> exclusively) of modern authorial manuscript materials, where the exact
>>> representation of the writing process in a single document, as a record of
>>> the author's developing inscription of his work, is so crucial as to trump
>>> every other need. Thus the genesis of the surface/zone system, in the work
>>> of the TEI workgroup on Genetic Editions, chaired by Fotis Jannidis and
>>> with Pierazzo and others among its members (see
>>> http://users.ox.ac.uk/~lou/wip/geneticTEI.doc.html). One might presume
>>> that, encouraged by the Guidelines' presentation of this system as
>>> appropriate for ALL transcription circumstances, people are using it to
>>> transcribe manuscripts, etc, which are not instances of genetic editions -
>>> hence the repeated requests on this list for a loosening of the content
>>> models for <line> etc.
>>> So, in contrast to Oliver's feature request for a loosening of the
>>> model: I'd agree with the various arguments against this. I don't
>>> think the floodgates will open and civilization will drown if we permit
>>> <rs> within model.linePart. But I do think people will tie themselves up
>>> in knots trying to use a system devised for genetic editions when
>>> transcribing medieval manuscripts. The problem would be considerably
>>> eased, I think, if the Guidelines made it clearer that the system
>>> described in Chapter 11 is really only to be used in the specific
>>> circumstance for which it was devised: for encoding of genetic editions.
>>> In addition, this chapter could also point out (as it currently does not)
>>> that the system of text/pb/lb etc is efficient and adequate for many
>>> transcription situations, and particularly for projects which wish (as
>>> most do) to represent both the intellectual structure of a text and the
>>> disposition of that text in a document in a single encoding. Seems to me
>>> that this might be put into a feature request.
>>> On 5 Feb 2014, at 06:31, Sebastian Rahtz wrote:
>>>> On 5 Feb 2014, at 12:18, James Cummings wrote:
>>>>> On 05/02/14 11:56, Lou Burnard wrote:
>>>>>> I find it slightly surprising though that no-one has yet proposed
>>>>>> permitting <surface>, <zone> and friends to appear within <text>
>>>>>> as an alternative to <div>. That would seem a neater way of
>>>>>> having your cake and eating it than any of the proposals so far
>>>>> For some reason that worries me. Perhaps as a muddying of the waters
>>>>> even further? It would require changing the content model of zone
>>>>> significantly to make it useful, wouldn't it?
>>>> I agree, just putting surface and zone into <text> generates a whole
>>>> lot more problems than it solves. Anyway, we all _know_ there is no
>>>> clean solution in XML to two hierarchies at the same time.
>>>> The old method of interspersing your <text> with <milestone> elements
>>>> (or its specialisations <pb>, <lb> etc) hasn't gone away.
>>>> Sebastian Rahtz
>>>> Director (Research) of Academic IT
>>>> University of Oxford IT Services
>>>> 13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431
>>>> I am nothing.
>>>> I shall never be anything.
>>>> I cannot wish to be anything.
>>>> Apart from that, I have in me all the dreams of the world.
>>> Peter Robinson
>>> Bateman Professor of English
>>> #311, Arts Building, 9 Campus Drive, University of Saskatchewan
>>> Saskatoon SK S7N 5A5, Canada
>>> ph. (+1) 306 966 5491