Ralph and all,
This is actually pretty simple to do in XSLT, *if* you have a processor
that supports an encoding like "US-ASCII" for serialization. The serializer
then does the transcoding of Unicode into character references for you.
<xsl:output encoding="US-ASCII" indent="yes"/>
<!-- indent can be 'yes' or 'no' as you like, also doctype-system etc. -->
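For anyone who wants to try it, here is a minimal sketch of a complete stylesheet built around that declaration. The identity template is my assumption about what is wanted (a copy of the input with nothing changed except the output encoding); it is not the only way to do it, and whether the character references actually appear depends on the processor supporting "US-ASCII" for serialization.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer for ASCII output. A processor that supports this
       encoding should write characters outside ASCII as numeric
       character references. -->
  <xsl:output method="xml" encoding="US-ASCII" indent="no"/>

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run against a UTF-8 document containing, say, an e-acute (U+00E9), a supporting processor should serialize that character as a numeric reference such as &#233; rather than as raw bytes.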
The trick is finding a processor that supports this encoding for
serialization. Support is not uncommon among Java transformers: I've run the
above successfully in Xalan. Saxon running under Java also does it, I think,
but Instant Saxon does not, because it runs in the Windows VM, which doesn't
know the "US-ASCII" target encoding.
You also wrote:
>I'm missing something (I don't have any background in
>programming, remember), but I thought that if you typed something
>in a Unicode font on a Unicode-compliant operating system, the
>machine would save it as Unicode. Or am I just being naive?
Mm, naive. Since the lower end of Unicode is congruent with ASCII etc.,
editors sometimes fail to preserve the encoding. The BOM that would serve as an
unambiguous "signature" for UTF-8 is optional, apparently. Not only that,
but since some browsers (e.g. NN 4~) choke on the BOM in UTF-8, some
editors leave it off, relying on heuristics (that can fail) to determine
whether something is supposed to be UTF-8, ISO-8859 etc. So things are ...
I hope that helps (or doesn't make it worse),
At 06:29 AM 4/12/2002, Sebastian wrote:
> > now I can display the text in the browser. Nevertheless I think I
> > would still like to have the ability to transform it, in the interests of
> > portability, since not everything will work with Unicode, particularly
>I'd guess that a very simple Perl program,
>which Michael will write for you (:-})
>is the simplest answer. I think XSLT is way over the top for this case.
Wendell Piez
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
Mulberry Technologies: A Consultancy Specializing in SGML and XML