TEI-L Archives

TEI-L@LISTSERV.BROWN.EDU


TEI-L  January 2015

Subject: Re: about <expan>
From: Andrew Dunning <[log in to unmask]>
Reply-To: Andrew Dunning <[log in to unmask]>
Date: Thu, 8 Jan 2015 17:15:18 -0500

What Helena suggests is also extremely practical for medieval Latin manuscripts, which use a relatively small set of abbreviations in thousands of combinations. I have found this to be the most efficient method of transcribing a heavily abbreviated text, but I would be very interested to know whether anyone has found a way of refining it further (perhaps with a postprocessor). Breaking abbreviations down into their constituent parts allows one to use TextExpander on the Mac (AutoHotKey on Windows, etc.) to get encoded versions of most abbreviations with very little typing, doing things like this (the first line representing what I am typing, and the second the result):

.per
:   <expan><am>&#xA751;</am><ex>per</ex></expan>

.pro
:   <expan><am>&#xA753;</am><ex>pro</ex></expan>

.dns
:   <expan><abbr>d</abbr><ex>omi</ex><abbr rend="supraline">n</abbr><ex>u</ex><abbr>s</abbr></expan>

u.ers.us (i.e. ‘uersus’)
:   u<expan><am>&#x33E;</am><ex>er</ex></expan>s<expan><am>&#xA770;</am><ex>us</ex></expan>

.concordat.us
:   <expan><am>&#xA76F;</am><ex>con</ex></expan>cordat<expan><am>&#xA770;</am><ex>us</ex></expan>

(Note: I have arranged my TextExpander snippets library so that a leading period designates a manuscript abbreviation, while a comma gives TEI elements, so that for instance typing ,subst will pop up a form asking for the added and deleted text. I can post an exported version of the library if it would be useful to anyone else.)
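The postprocessor idea mentioned above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the table copies just the snippets shown in the examples, while the names `SNIPPETS`, `am_ex` and `expand_shorthand` and the rule "a period followed by a known key, longest key first" are hypothetical and are not taken from TextExpander or from the snippet library described here.

```python
import re


def am_ex(glyph: str, letters: str) -> str:
    """Build the <expan><am>…</am><ex>…</ex></expan> form used in the examples."""
    return f"<expan><am>{glyph}</am><ex>{letters}</ex></expan>"


# Hypothetical table: only the abbreviations shown in the examples above.
SNIPPETS = {
    "per": am_ex("&#xA751;", "per"),
    "pro": am_ex("&#xA753;", "pro"),
    "er": am_ex("&#x33E;", "er"),
    "con": am_ex("&#xA76F;", "con"),
    "us": am_ex("&#xA770;", "us"),
    "dns": (
        "<expan><abbr>d</abbr><ex>omi</ex>"
        '<abbr rend="supraline">n</abbr><ex>u</ex><abbr>s</abbr></expan>'
    ),
}

# Longest keys first, so that ".per" is tried before ".er" at the same period.
_KEYS = sorted((re.escape(key) for key in SNIPPETS), key=len, reverse=True)
_PATTERN = re.compile(r"\.(" + "|".join(_KEYS) + ")")


def expand_shorthand(text: str) -> str:
    """Replace every '.key' shorthand in text with its encoded expansion."""
    return _PATTERN.sub(lambda match: SNIPPETS[match.group(1)], text)


if __name__ == "__main__":
    for typed in (".per", "u.ers.us", ".concordat.us"):
        print(typed, "->", expand_shorthand(typed))
```

A sketch like this treats any period directly followed by a known key as shorthand, so a sentence-final period typed without a following space would be misread; a real postprocessor would need a less ambiguous trigger or a review step.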

All best,

Andrew Dunning
PhD Candidate
Collaborative Program in Editing Medieval Texts
Centre for Medieval Studies
University of Toronto

http://andrewdunning.ca

> On 7 Jan 2015, at 3:41 AM, Matthew James Driscoll <[log in to unmask]> wrote:
> 
> I agree that for most purposes just using <am> and <ex> should suffice. I also agree that using <g> is a good solution (though you need to close it in your second example: <g ref="#abper"/>). The MUFI entity name for this, by the way, is "pbardes".
> 
> You could also just use the Unicode character, U+A751 (LATIN SMALL LETTER P WITH STROKE THROUGH DESCENDER).
> 
> super = su<choice><am>&#xa751;</am><ex>per</ex></choice>
> 
> The belt-and-braces approach would be to use both abbr/expan and am/ex:
> 
> <choice>
>  <abbr>su<am>&#xa751;</am></abbr>
>  <expan>su<ex>per</ex></expan>
> </choice>
> 
> All of these assume that the "abbreviation" is the entire character, p with a stroke through the descender, and that the letters "p e r" comprise its "expansion". One could, of course, argue that the "p" really is there, representing itself, and that it is only the stroke which "stands for" the letters "e r", which are then supplied when the abbreviation is expanded. Which could be marked up like this:
> 
> sup<choice><am>&#x0331;</am><ex>er</ex></choice>
> 
> (here using the combining lower macron for the stroke).
> 
> But that's a can of worms probably best left unopened.
> 
> All the best,
> Matthew
> 
> 
> -----Original Message-----
> From: TEI (Text Encoding Initiative) public discussion list [mailto:[log in to unmask]] On Behalf Of Paul Schaffner
> Sent: 6 January 2015 21:47
> To: [log in to unmask]
> Subject: Re: about <expan>
> 
> I think the resources to do what you want to do are already present in P5, without modification.
> 
> We deal with very similar things, albeit in nonstandard ways. E.g. we define an SGML entity &abper; which represents the brevigraph "per" ('p with a swoosh through it'), and by that means represent
> 
> (super) = su&abper;
> (per)   = &abper;
> (perpetua) = &abper;petua
> 
> In P5 these become
> 
> super = su<g ref="#abper">per</g>
> per = <g ref="#abper">per</g>
> perpetua = <g ref="#abper">per</g>petua
> 
> Note that the brevigraph 'per' can be represented this way without any use of expan or abbr, and without any commitment as to whether the abbreviated item is a word or part of a word.
> 
> The same can be said for an expanded markup using ex and am (i.e., a treatment that makes explicit what is implicit in the previous version), viz.,
> 
> super = su<choice><am><g ref="#abper"></am><ex>per</ex></choice>
> per = <choice><am><g ref="#abper"></am><ex>per</ex></choice>
> perpetua = <choice><am><g ref="#abper"></am><ex>per</ex></choice>petua
> 
> Note that all of these manage without abbr and expan.
> The latter are designed for use when you are trying to capture abbreviated/expanded words as such. Perhaps that is not really what you want to do?
> 
> pfs
> 
> 
> On Tue, Jan 6, 2015, at 15:18, Martin Holmes wrote:
>> Hi Helena,
>> 
>> You didn't get any reply to this on the list, but I wanted to make 
>> sure the query didn't get lost; I think it's a good question. The core 
>> definition of <expan> is just:
>> 
>> <expan> (expansion) contains the expansion of an abbreviation
>> 
>> but the chapter prose does contain the phrase you quote: "It should 
>> always include the whole of an abbreviated phrase or word."
>> 
>> There is one argument for providing the fully-expanded word, unbroken: 
>> if you are using a system which creates a text index of the XML, the 
>> indexer will easily find the complete word (e.g. "perduda") rather 
>> than a sequence of tags which will need to be combined to create the 
>> indexable item:
>> 
>> <choice><abbr>p</abbr><expan>p<ex>er</ex></expan></choice>duda
>> 
>> versus:
>> 
>> <choice><abbr>pduda</abbr><expan>perduda</expan></choice>
>> 
>> where an indexer need only be told to ignore <abbr> in favour of <expan>.
>> 
>> However, I take your point that p -> per is the same phenomenon when 
>> it occurs in multiple contexts, and therefore that it's probably more 
>> logical to encode it the same way wherever it occurs.
>> 
>> People who know more about this kind of text should weigh in on this, 
>> but one option is to customize your schema to create your own 
>> replacement for <expan> which has the semantics you need. On the other 
>> hand, if there is support for the idea that the sentence from the 
>> chapter prose is excessively constraining (thinking for instance of 
>> languages in which "word" is not such a meaningful concept), then we 
>> could raise a ticket to consider rephrasing or deleting it.
>> 
>> Cheers,
>> Martin
>> 
>> On 15-01-04 11:25 AM, Helena B. Sabel wrote:
>>> Dear TEI list,
>>> 
>>> The TEI Guidelines declare that the element <expan> "should always 
>>> include the whole of an abbreviated phrase or word." However, I find 
>>> this definition problematic when editing Medieval Portuguese poetry.
>>> 
>>> In this tradition, more often than not, abbreviations were used to 
>>> avoid writing a certain string of characters, and not specifically 
>>> to abbreviate a word. The same abbreviations occur in all kinds of 
>>> positions in the character string and the characters that may 
>>> precede or follow the abbreviation are irrelevant. Consider for 
>>> example three common occurrences of the abbreviation "p":
>>> 
>>> 1) As a standalone string equivalent to the preposition "per".
>>> 
>>> 2) In combination with other characters to create words that have 
>>> the string "per" in them: "pder" (for "perder"), "pduda" (for 
>>> "perduda"), "aptar" (for "apertar"), etc.
>>> 
>>> 3) Adjacent to other characters although they belong to different 
>>> "words" (the use of spaces as word boundaries is inconsistent).
>>> 
>>> Thus, my preference would be to tag "p" when it represents "per" always as:
>>> 
>>> <expan>p<ex>er</ex></expan>
>>> 
>>> without incorporating the rest of the word into the contents of <expan>.
>>> If I were to encode entire expanded words, among other editorial 
>>> conflicts, I would be imposing an anachronistic concept of "word", 
>>> asserting that the word was abbreviated, although there is no reason 
>>> to believe that the scribe thought of writing "p" for "per" as 
>>> abbreviating a word. So my question is: is it appropriate in this 
>>> type of context to talk about expanded segments (not words) and use <expan> "inside" a word?
>>> 
>>> Best,
>>> 
>>> Helena
>>> 
>>> 
> --
> Paul Schaffner  Digital Library Production Service [log in to unmask] | http://www.umich.edu/~pfs/
