----------------------------Original message----------------------------
>    Date: Fri, 2 Oct 92 17:17:26 +0100
>    From: Keld Jørn Simonsen
>
>    It is a misunderstanding that my scheme is only 2 characters;
>    about 500 characters in RFC 1345 have short identifiers with
>    3 or more characters in them. About 24,000 Chinese characters are
>    also defined, and all of these are 5 characters long...
>
> Thanks for the clarification.  If this is the case, then why don't
> you simply use the ISO 10646 names?  This would eliminate having a
> redundant name collection which can only cause confusion and allow
> errors to creep in.  Other than for efficiency reasons (i.e., the
> length of the character name), there doesn't seem to be any
> justification for having another set of names.  And, if storage
> efficiency is the only possible justification, I'm sure there are
> better ways to accomplish this, e.g., using 10646 as the BASESET
> in the concrete syntax.
>
> Glenn
 
There are several reasons for the design; here are a few:
 
1. For readability. The 10646 names are too long to be useful for
   humans reading text, while my notation is at least more
   adequate. For example, my name:
 
     Keld J<LATIN SMALL LETTER O WITH STROKE>rn Simonsen
 
   vs:
 
     Keld J<o/>rn Simonsen
 
   It is all a matter of taste, but I find the latter readable;
   it does not disturb my rhythm of reading too much. Reading
   the 10646 names would fill my brain with LETTER and LATIN
   and STROKE, which are really not that relevant.
 
2. For writability:
 
   If I cannot generate a character directly from the keyboard,
   I can use a kind of compose-character sequence to input it.
   The 10646 name is then very long and very error-prone to
   input: the name in the above example is 32 characters, which
   would lead to frequent mistyping, while the two-letter
   combination is much easier to type and (at least for this
   example) easier to remember.
 
3. For presentation:
 
   The character set tables in RFC 1345 can be presented
   in about 100 pages, while an equivalent presentation using
   the 10646 names would be about 10 times larger.
   Thus short names save trees, are more manageable in publication, etc.
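
To make the comparison concrete, here is a minimal sketch in Python of how
both notations could be expanded back to the real character. It is only an
illustration under stated assumptions: the MNEMONICS table is a hypothetical
two-entry excerpt (the full list is in RFC 1345), the angle-bracket
delimiters simply follow the example in point 1, and the function name
expand is made up for this sketch. The long names are resolved with
Python's standard unicodedata module.

```python
import re
import unicodedata

# Hypothetical two-entry excerpt; the full mnemonic table is in RFC 1345.
MNEMONICS = {
    "o/": "\u00f8",  # LATIN SMALL LETTER O WITH STROKE
    "O/": "\u00d8",  # LATIN CAPITAL LETTER O WITH STROKE
}


def expand(text):
    """Replace each <name> with the character it names.

    Short identifiers are looked up in MNEMONICS first, then the name is
    tried as a full character name. Unknown names are left untouched.
    """
    def repl(match):
        name = match.group(1)
        if name in MNEMONICS:
            return MNEMONICS[name]
        try:
            return unicodedata.lookup(name)
        except KeyError:
            return match.group(0)

    return re.sub(r"<([^<>]+)>", repl, text)


short_form = "Keld J<o/>rn Simonsen"
long_form = "Keld J<LATIN SMALL LETTER O WITH STROKE>rn Simonsen"

if __name__ == "__main__":
    print(expand(short_form))
    print(expand(long_form))
    # What a human has to type or read in each notation:
    print(len("o/"), len(unicodedata.name("\u00f8")))
```

Both inputs expand to the same text; the only difference is how much a
human has to type or read to get there.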
 
Keld