At 04:50 PM 1/9/2008, you wrote:
>just curious,
>what are your ideas on marking these words in a way that
>maintain all their possible reads ?
>  * s/he    [she or he]
>  * media/tion media(tion) media-tion  [media and mediation]
>  * car(s)   [car or cars]

If I understand your question correctly, these examples illustrate a 
phenomenon which, while not at all unusual in text, defies simple 
handling in markup. That is because it goes beyond the ordinary use 
of text as a sequence of characters that, assembled according to a 
fairly simple set of rules (in modern European literacies, serial 
assembly of characters from left to right), encodes representations 
of some other set of entities (whether you take that other layer to 
be words, sentences, utterances or what have you). These forms 
"escape" the base rules, that is, and invoke other rules, which a 
human reader usually has to infer in order to construe them properly.

Another way of putting it is that this kind of thing steps across a 
line between text as (simple) text, and text as (complex) notation, 
or indeed, text as markup of text. Of course, scholars of text can 
also remind us that there is no such line, or at any rate not 
"naturally": to the extent there is a line, it is there only by 
virtue of regularities imposed from without or observable across a 
wide domain.

The usual way of handling this sort of thing in the current practice 
of markup is to expand the notation explicitly into what it 
represents, using markup to note both the expanded or implicit form, 
and the form as instantiated. How that particular expansion is made 
will depend on the details of the notation, both its form and its 
(presumed) purpose and intention. TEI offers plenty of examples of 
this sort of thing in its handling of abbreviations, regularized forms, etc.
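
To make the idea of "expanding the notation" concrete, here is a 
minimal sketch in Python, offered as an illustration rather than as 
anything TEI prescribes. It assumes one particular notation, a word 
followed by a single parenthesized optional ending (as in "car(s)" 
or "media(tion)"), and records both the form as written and the 
readings presumed to be intended. The names OPTIONAL_SUFFIX, 
Expansion and expand are hypothetical, invented for this example.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed notation: a word followed by one parenthesized optional ending,
# e.g. "car(s)". Other notations ("s/he", "media/tion") need their own rules.
OPTIONAL_SUFFIX = re.compile(r"^(\w+)\((\w+)\)$")


@dataclass
class Expansion:
    written: str         # the form as instantiated in the text
    readings: List[str]  # the expanded forms a reader is presumed to infer


def expand(token: str) -> Expansion:
    """Expand a token using the single rule this sketch knows about."""
    match = OPTIONAL_SUFFIX.match(token)
    if match is None:
        # No rule applies: keep the token as its own (only) reading.
        return Expansion(written=token, readings=[token])
    stem, ending = match.groups()
    return Expansion(written=token, readings=[stem, stem + ending])


if __name__ == "__main__":
    for token in ["car(s)", "media(tion)", "s/he"]:
        print(expand(token))
```

Note that "s/he" falls through unexpanded. The rule for the 
parenthesis notation says nothing about the slash, which is exactly 
the point: each such convention needs its own rule, and someone has 
to decide what that rule is.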

>(i don't know how to call this kind of phenomena .. overlapping text ?)

I wouldn't call it overlapping text as such, although such notations, 
the marking up of such notations, and/or the representation of 
notational conventions in the form of markup, will frequently involve 
overlap -- which is one reason why expanding them into a more prolix 
form is usually the path of least resistance in markup regimens that 
are good at hierarchy and organization, but not so good at overlap.

When it is really extensive or very artful, this kind of thing 
becomes a certain sort of poetry working at the character level, such 
as the calligrammes of Apollinaire, which are similarly intractable. 
But then, those weren't designed for machine processing, so why 
should they be tractable?


Wendell Piez
Mulberry Technologies, Inc.      
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
   Mulberry Technologies: A Consultancy Specializing in SGML and XML