I agree with Erik Naggum about the impracticality of the mnemonic naming
proposed by Keld Simonsen. While I believe Keld's intentions were good,
I also think that his working model is untenable, i.e., that one can derive
a set of useful mnemonics (two characters in length) for each character in
the union of all character sets.
As Erik points out, the nature of Keld's proposed notation limits the
collection of mnemonics to a number far less than the number of elements
in 10646; furthermore, the mnemonic value is quickly lost, and, indeed,
is completely irrelevant in the case of Han characters. [How should one
choose a mnemonic for a Han character? Should its meaning be used? Or
its pronunciation? In either case, both meaning and pronunciation differ
across the different uses of the same character among different writing
systems, e.g., Chinese, Japanese, Korean, & Vietnamese.]
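
The counting argument can be checked with a rough sketch. It assumes that
two-character mnemonics are drawn from the 94 printable ISO 646 characters
(an assumption; Keld's proposal may use a different alphabet), and it compares
the resulting upper bound against the unified Han block of the 10646 basic
plane (U+4E00 through U+9FA5) alone.

```python
# Assumption: mnemonics are built from the 94 printable ISO 646 characters
# (0x21 through 0x7E); the actual proposal may use a different alphabet.
ALPHABET_SIZE = 0x7E - 0x21 + 1
MNEMONIC_LENGTH = 2

# Upper bound on the number of distinct two-character mnemonics.
max_mnemonics = ALPHABET_SIZE ** MNEMONIC_LENGTH

# Size of the unified Han block in the 10646 basic plane: U+4E00..U+9FA5.
han_count = 0x9FA5 - 0x4E00 + 1

print(max_mnemonics, han_count, max_mnemonics < han_count)
```

Under these assumptions the Han block by itself exceeds the number of
available two-character mnemonics, before any other script is counted.
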
On the other hand, if one were to use the full names of 10646, a file could
become quite unwieldy in size due to the enormous expansion required to
convert non-ISO646 character references to entity names.
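
To get a feel for the expansion, here is a minimal sketch. The function
to_entity_names and its naming scheme (the full character name with spaces
turned into hyphens, wrapped in "&" and ";") are hypothetical, invented for
illustration only, and such long names would not fit the name-length limit of
the reference concrete syntax anyway. Python's unicodedata names stand in for
the 10646 names.

```python
import unicodedata

def to_entity_names(text: str) -> str:
    """Replace each non-ISO 646 character with a full-name entity reference.

    Hypothetical scheme: the character name with spaces turned into hyphens.
    Characters without a name make unicodedata.name raise ValueError.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # ISO 646 range: keep as-is
        else:
            name = unicodedata.name(ch).replace(" ", "-")
            out.append("&" + name + ";")
    return "".join(out)

sample = "caf\u00e9"
expanded = to_entity_names(sample)
print(expanded, len(sample), len(expanded))
```

A single accented letter already expands to several dozen characters; a
document written mostly in a non-Latin script would grow accordingly.
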
Personally, I think folks should be thinking about concrete syntaxes whose
baseset is ISO 10646, rather than building systems based on the reference concrete
syntax. Of course these two concrete syntaxes are isomorphic by means of
entity referencing. But we should really be building full 10646 syntaxes.
Document transfer can easily be accomplished by means of appropriate
P.S. Erik exaggerates when he says that "ISO 10646 contains all (all!) known
characters in the universe." There are many characters which are known but
are not yet encoded in 10646; they will be there eventually, but we're not