On Wednesday, February 20, 2002 10:03 PM
Rafal T. Prinke wrote:
> I have been using TEI-emacs for some time (and Kevin Russell's
> Ebenezer's suite before that) and the greatest problem I have
> is its Unicode support. While it displays a range of rare
> (but important) characters correctly (e.g. long-s or planetary
> symbols), it seems to replace other (equally or even more important)
> characters with spaces (e.g. aelig or thorn). Is there any remedy
> for this?
I'm sure there is, but there is an anti-Unicode culture among some of the
people with the greatest expertise in matters of encoding and character
representation on Unix platforms (and therefore on Unix apps ported to
Win32). A particularly depressing example is at
where the account given of Unicode in effect stops with Unicode 2. It fails
to reflect how far the Unicode committees have learned from the "Han
unification" controversies and suchlike and have now taken most of the
necessary steps to address the issues which, according to this influential
piece by an undoubted authority, can only be met by (mis)using Mule to
juggle encodings and character sets on the fly, something which is at odds
with cardinal principles of XML.
> Some time ago someone observed here (sorry - I don't remember
> who it was)
Well, as long as the message got across...
> that while many editors claim to fully support XML
> specs, in reality most of them do not support Unicode which
> is the most fundamental aspect of the specs.
When a software house announced a new (commercial) Win32 XML editor on the
XSLT list, I pointed out that in the small print it said its "support
for non ASCII characters" meant support for ISO-8859-1 only. I received an
abusive reply from the authors, which in effect threatened to sue me if my
"defamation" of their product affected their sales. Well, that product seems
to have sunk without trace, and I await their writ with some interest...
> I have tested a number of programs (demos or freeware) on
> Win95/98 and it is only XED that gives reasonably full Unicode support.
The obstacles that Win9x puts in the way of consistent Unicode handling are
enormous, and software authors can be forgiven for not taking them on. (The
authors of XED did their valuable job by enlisting the Unix-bred
capabilities of Python/Tk to make it run on Win32 platforms). What is less
forgivable is when software houses plead the need for backwards
compatibility with Win9x as an excuse for not providing the full Unicode
support which NT in all its incarnations, and now XP, make pretty easy. And
now that even VB.NET at last allows those nice screen forms to be drawn and
used without enforcing repeated transcoding between the system codepage and
the internal (UTF-16) representation, there really is no excuse for Win32
applications not to support Unicode completely, provided serious users are
prepared to dump Win9x/ME in the garbage receptacles where it truly belongs.
But in the commercial sector, informed customer pressure is needed, along
with a clear message that Win9x compatibility is not a requirement of the
specialist encoding market. I was recently dismayed to discover that the
very latest version of the terminological database package most widely used
by professional translators, whose release had been delayed for a couple of
years to achieve "full XML support", still does all its keyboard and screen
I/O via transcoding to and from the default codepage of the local system,
presumably in the interests of Win9x compatibility. So it still doesn't
allow, for example, the embedding of a Japanese term within a Norwegian
sentence. Indeed, to handle Japanese at all, it is necessary to set the
system locale to JP and live with all those Yen signs in system paths.
And this in a product which translation companies cheerfully pay
thousands of dollars to license. Send your UTF-8 encoded XML to such a
company for translation, and it may emerge from their "fully XML compatible"
software bearing the scars of multiple, and completely unnecessary,
transcodings.
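
To make the codepage problem concrete, here is a minimal sketch in Python of what a round trip through a single system codepage does to mixed-script text. It illustrates the general effect only and is not a reconstruction of how any particular product works; the sample sentence and the function names are invented for the illustration, and cp1252 and cp932 merely stand in for a Western European and a Japanese system codepage.

```python
# Hypothetical sample: a Norwegian sentence with an embedded Japanese term.
SAMPLE = "Ordet 用語 betyr term på norsk."


def roundtrip(text: str, codepage: str) -> str:
    """Encode text to a codepage and decode it again.

    Characters the codepage cannot represent are replaced by '?',
    which is roughly what a lossy transcoding step leaves behind.
    """
    return text.encode(codepage, errors="replace").decode(codepage)


def lost_characters(text: str, codepage: str) -> list[str]:
    """Return the characters of text that the codepage cannot represent."""
    lost = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            lost.append(ch)
    return lost


if __name__ == "__main__":
    # cp1252: Western European; cp932: Japanese; utf-8: lossless for comparison.
    for codepage in ("cp1252", "cp932", "utf-8"):
        print(codepage, "|", roundtrip(SAMPLE, codepage),
              "| lost:", lost_characters(SAMPLE, codepage))
```

Under the Western codepage the Japanese term is destroyed, under the Japanese codepage the Norwegian "å" goes instead, and only the UTF-8 round trip returns the sentence intact. No single legacy codepage can carry both halves of the sentence.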
> XMetaL and XMLSpy state that they do support it on NT but
> not on 95/98 (and I have not tested them on NT).
XMLSpy in particular is designed from the ground up to exploit the full
Unicode capabilities of NT. Using it under Win9x eliminates much of its
power. I imagine it's mainly for marketing reasons that Altova don't make
that as clear as they might in their advertising and docs.
Michael Beddow http://www.mbeddow.net/
XML and the Humanities page: http://xml.lexilog.org.uk/
The Anglo-Norman Dictionary http://anglo-norman.net/