In an article in comp.text.sgml <[log in to unmask]>,
Joe English ([log in to unmask]) wrote:
 
   The CDATA keyword means different things in different contexts.
 
   When used as the declared value of an attribute, it means that the
   attribute's value is plain character data and is not tokenized
   or further interpreted by the parser in any special way (contrast
   ID, IDREFS, ENTITY, et cetera.)  Note that attribute value literals
   are *always* parsed as replaceable character data, regardless of
   their declared value.
 
   When used as the declared content of an element type, it does
   something quite different.
 
But is interpretation of &foo; in a CDATA attribute literal _mandated_
by 8879 or is it optional for the application?
 
Our current problem concerns the editing/display of Hebrew fragments in
otherwise Latin-1 text. Using TEI as an example:
 
   <FOREIGN LANG="iw" REG="&aleph;">A</FOREIGN>
 
(REG is a local addition; the example is simplified for demonstration).
The A is given because the scheme of transliteration used by the
original editor whose work is being cited used an A, and we are
reproducing this verbatim for the ease of use of scholars.
 
There is a font available which has Latin-1 characters in the ASCII
positions, and Hebrew characters in the high-order (>127 decimal)
positions. It is therefore easy in a graphical editor (e.g. Author/Editor)
to attach a style to any occurrence of <FOREIGN LANG="iw"> to use such a
font, so the "A" content displays as an A.
 
In the local entity additions (or in the editor itself) it is possible
to equate &aleph; to an arbitrary character, say \345, which is the
location of the aleph glyph (how euphonious :-) in the font used.
 
If you then apply a suffix (an "affix" in A/E terminology) which replicates the
value of REG on the screen (for example in parentheses after the element
content, but still within the domain of FOREIGN), should the "&aleph;"
string be interpreted as a character entity reference and thus display
the aleph character from position \345 in the font, or should it display
the string "&aleph;"?
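
To make the two possible outcomes concrete, here is a rough sketch in
Python. It is only an illustration, not a statement of what 8879 requires
or of how Author/Editor behaves: the ENTITIES table and the expand_literal
function are hypothetical names, and the sketch assumes every reference is
closed with ";" and simply leaves undeclared names untouched.

```python
import re

# Hypothetical local entity declarations: name -> replacement text.
# \345 (octal) is 229 decimal, the position of the aleph glyph in the font.
ENTITIES = {"aleph": chr(0o345)}

# Assumption: a reference is "&", a name, then ";" (no omitted terminator).
_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def expand_literal(literal, entities=ENTITIES):
    """Replace each declared &name; in literal; leave anything else as is."""
    def replace(match):
        return entities.get(match.group(1), match.group(0))
    return _REF.sub(replace, literal)


reg_literal = "&aleph;"

# Outcome 1: the reference is interpreted, giving the single character at
# position \345, which the Hebrew font would draw as an aleph.
interpreted = expand_literal(reg_literal)

# Outcome 2: the literal is passed through untouched, so the affix would
# show the seven-character string "&aleph;".
uninterpreted = reg_literal
```

In the first case the affix receives one character (code 229); in the
second it receives the string exactly as typed in the attribute.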
 
(I hope that's clear :-)
 
///Peter
--
Peter Flynn        | [log in to unmask]  | ...persuade users that spreading
Computer Centre    | +353 21 276871 x2609 | fonts across the page like peanut
University College | +353 21 277194 (fax) | butter across hot toast is not the
Cork, Ireland      | Opinions are my own. | route to typographic excellence...