Hi, Mike— Your project looks fascinating! I’m going to try to address your questions in reverse order:
I believe the problem you’re seeing with the <seg> element is a problem with XML well-formedness: You’ve opened <seg> inside one paragraph and closed it inside another, which produces what we call “tangled tags”, like so: <p><seg></p><p></seg></p>. To the XML parser, this disrupts the XML hierarchy: an element that opens inside a <p> must also close inside that same <p> element.
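If it helps to see the parser's point of view, here is a minimal sketch using Python's standard-library XML parser. The function name and the two sample strings are mine, made up for illustration; they are not from your files.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# seg opens in the first p and closes in the second: tangled tags.
tangled = "<body><p><seg>one</p><p>two</seg></p></body>"

# Each seg opens and closes inside a single p.
nested = "<body><p><seg>one</seg></p><p><seg>two</seg></p></body>"

print("tangled:", is_well_formed(tangled))
print("nested:", is_well_formed(nested))
```

This only checks well-formedness. Whether the markup is valid against the TEI schema is a separate question that a plain parser does not answer.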
Very well, you might say, how about if I change the hierarchy, and set <seg> outside of those <p> elements: <seg><p></p><p></p></seg>? Well, the TEI schema will fire an error here because seg isn’t allowed to contain <p> children. This is a pretty common issue in our community, and there are a variety of ways to deal with it: it’s a problem of how to write good XML markup that accommodates overlapping hierarchies. It’s going to take some planning. I might be tempted in this case, if you’re really liking <seg>, to work with it like so:
<p>….<seg xml:id="a1" next="#a2">…</seg></p>
<p><seg xml:id="a2" prev="#a1">…</seg>…</p>
Here I’m using an @xml:id to set unique identifiers on each seg, and I’m using @next and @prev to point to the members of a series that span multiple paragraphs in the document. That’s one way of approaching the problem, but there will certainly be others.
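One worry with splitting a passage into linked pieces is whether you can put it back together later. Here is a rough sketch, in Python, of following the @next pointers to rejoin the parts. The function name `joined_passages` and the shortened sample text are my own assumptions, and I've left out the TEI namespace to keep it small, so treat this as an illustration rather than a finished tool.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Shortened, hypothetical sample: one passage split across two paragraphs.
sample = """<body>
<p>Opening words. <seg xml:id="a1" next="#a2" type="qualitiesBuddha">The Tathāgata was handsome and charismatic.</seg></p>
<p><seg xml:id="a2" prev="#a1" type="qualitiesBuddha">His body was adorned with the thirty-two marks of a great being.</seg> Closing words.</p>
</body>"""


def joined_passages(root):
    """Return one string per chain of seg elements linked by @next."""
    by_id = {seg.get(XML_ID): seg for seg in root.iter("seg") if seg.get(XML_ID)}
    passages = []
    for seg in root.iter("seg"):
        if seg.get("prev"):
            continue  # not the first member of a chain
        parts, seen, current = [], set(), seg
        while current is not None and id(current) not in seen:
            seen.add(id(current))  # guards against circular pointers
            parts.append("".join(current.itertext()).strip())
            target = (current.get("next") or "").lstrip("#")
            current = by_id.get(target)
        passages.append(" ".join(parts))
    return passages


for passage in joined_passages(ET.fromstring(sample)):
    print(passage)
```

A seg with no @next or @prev simply comes back as a passage of its own, so the same function covers passages that fit inside a single paragraph.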
Now, as for <interp>, your use of this has a certain logic but isn’t consistent with the TEI Guidelines’ explanation and examples, where the element isn’t used as markup wrapped around the base text. Instead, <interp> is part of a little family of elements (with <spanGrp> and <span> and more) for handling what we call “stand-off annotation”: analytical notes with a set vocabulary that you attach to a base text by pointing at it rather than by wrapping it. This is a little difficult to explain, so first take a look at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/AI.html#AISP
Do you see how that’s being used?
I’d say we don’t want <interp> here, unless you want to come up with a use of <spanGrp> as a means of handling your annotations. That’s worth considering, too!
But I think you might simply continue using the <seg> element in place of how you’re using <interp>. Use @type to define a fixed set of categories for seg, one for each kind of thing you care about. Presumably your seg elements hold long spans of text carrying various kinds of information that you want to highlight. What are those kinds of information? You could come up with @type and @subtype categories to apply to that element.
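To make that concrete, here is a small sketch that recasts your <interp> example as <seg> with @type and @subtype and then tallies the categories in Python. The category values ("place"/"main", "audience"/"general") are only my guess at how your "placeMain" and "audienceGen" labels might split; the vocabulary is yours to design.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical recasting of the interp example; the type/subtype values are assumptions.
sample = (
    '<p>In the <seg type="place" subtype="main">Country of the Bhargas, '
    "on Mount Śuśumāra in a fearsome forest of wild animals</seg> with a "
    '<seg type="audience" subtype="general">great saṅgha of about 500 monks, '
    "eminent śrāvaka-elders who possessed clairvoyance</seg>.</p>"
)


def count_categories(root):
    """Count seg elements by their (type, subtype) pair."""
    return Counter((seg.get("type"), seg.get("subtype")) for seg in root.iter("seg"))


for (seg_type, subtype), total in count_categories(ET.fromstring(sample)).items():
    print(seg_type, subtype, total)
```

A tally like this is also a cheap way to catch stray or misspelled category values as the project grows.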
I hope this resolves the problem and gives you some ideas!
Elisa Beshero-Bondar, PhD
Director, Center for the Digital Text | Associate Professor of English
University of Pittsburgh at Greensburg | Humanities Division
150 Finoli Drive
Greensburg, PA 15601 USA
Development site: http://newtfire.org
I'm involved in a project in which we're marking up a large body of literature, and we want to mark various passages in the texts as significant, in essence labeling them as "this" or "that". We took a look through the TEI Guidelines and decided to use <seg> and <interp> to mark the various passages, with different attributes to specify what type of passage each one is.
For example, with <interp>:
In the <interp type="placeMain">Country of the
Bhargas, on Mount Śuśumāra in a fearsome forest of wild animals</interp> with a <interp type="audienceGen">great saṅgha of about 500 monks, eminent śrāvaka-elders who possessed clairvoyance</interp>.
And with <seg>:
<seg function="modul" type="pastLifeWho">if you wonder whether
the brahmin boy
Bhadraśuddha was then at that
time someone else, or you are of two minds
about it, or doubtful, do not
see him so. Why? Because the bodhisattva
mahāsattva Maitreya himself
was then at that time the brahmin boy
1) Since this is going to be a very long-term and labor-intensive project, I wanted to check whether the community in general feels this is a reasonable way to mark these passages, and whether there are any suggestions for other ways to do this which might work better. Can anyone suggest any other elements that might be useful for this kind of thing?
2) We have a problem of going across elements (breaking the nesting so to speak). For instance, a passage might start in the middle of one paragraph and finish halfway through another paragraph, and when we mark it the tag begins in one paragraph and closes in the next. The schema doesn't like this one bit, and I'm wondering what is the best way to handle this. For example:
<p><seg function="modul" type="qualitiesBuddha">The Tathāgata was handsome and charismatic,
controlled in his faculties and in his mind. He had attained excellence in
control and calm abiding, and superiority in control and calm abiding. He
guarded his faculties, elephant-like in control of his passions, and
was radiant, unsullied, and clear like a lake.</p>
<p>His body was adorned with the
thirty-two marks of a great being, and with the eighty minor marks, like the
blossoming flower of a royal sal tree, and towering like Mount Meru, the
king of mountains. His face was as calm as the sphere of the moon, and
radiantly clear and brilliant like the sphere of
the sun. His body was proportioned like a nyagrodha tree, blazing with light
and great splendor.</seg></p>
Any help is greatly appreciated. Forgive the formatting.