On Fri, Apr 24, 2009 at 4:07 PM, Andrew Jarrette <[log in to unmask]> wrote:
> --- In [log in to unmask], "Mark J. Reed" <markjreed@...> wrote:
>>
>> And what operating system and web browser are you using?
> Windows 2000 and Mozilla Firefox.  I'm still looking to find out where I saw the strange characters.


Is it possible it was in a quoted reply from someone else?   We do
have some subscribers who still have trouble with such characters, so
even though they may go out of your computer fine and come back fine,
they can get mangled in someone else's quotation of your text.

> I believe I use ASCII codes?  I'm really rather deficient in terminology and technological
>  know-how

The deficiency is on the part of the technologists who haven't managed
to make this stuff transparent to the folks who don't have the
know-how yet.  It's like having to be a mechanic just to start up your
car and drive it to the grocery store.

Anyway, ASCII (the American Standard Code for Information Interchange)
is the common subset of characters that works just about everywhere:
plain letters with no diacritical marks, some punctuation (including
standalone diacritical marks that were intended to be used as
overstrikes back when everything was printed on paper), etc.;
basically, the characters you can type on the keyboard with bare keys
and shifted keys, with no alt key.  There are only 128 characters in
ASCII, and about a quarter of those are taken up by control codes,
most of which aren't used much in plain text.
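
If it helps to see the numbers, here is a minimal Python sketch of that
breakdown.  The names (ASCII_SIZE, is_plain_ascii) are made up for
illustration; they are not from any library.

```python
# ASCII assigns the codes 0 through 127, so 128 characters in all.
ASCII_SIZE = 128

# Control codes are 0-31 plus 127 (DEL); everything else is printable.
control_codes = [c for c in range(ASCII_SIZE) if c < 32 or c == 127]
printable = [chr(c) for c in range(ASCII_SIZE) if 32 <= c < 127]


def is_plain_ascii(text: str) -> bool:
    """Return True if every character in text falls in the ASCII range."""
    return all(ord(ch) < ASCII_SIZE for ch in text)


print(len(control_codes), len(printable))   # 33 95
print(is_plain_ascii("cafe"))               # True
print(is_plain_ascii("caf\u00e9"))          # False: e-acute is outside ASCII
```

A check like is_plain_ascii is a quick way to tell whether a message
contains anything that might get mangled along the way.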

The accented characters, characters in other alphabets, etc., come from
a much larger character repertoire called Unicode, which has room for
over a million different characters, not nearly filled, with
characters from all over the world and the distant past as well as the
present. We want everything to be in Unicode all the time, and we're
getting there.
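
For the curious, a rough Python sketch of what "over a million" means
in practice.  It uses only the standard sys and unicodedata modules;
the sample characters are my own picks for illustration.

```python
import sys
import unicodedata

# Unicode code points run from U+0000 through U+10FFFF.
code_point_count = sys.maxunicode + 1
print(code_point_count)   # 1114112

# A plain letter, an accented letter, a Greek letter, a Cyrillic letter.
samples = "e\u00e9\u03b1\u0416"
for ch in samples:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Only the first of those four sample characters is in ASCII; the rest
need Unicode or one of the older regional character sets.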

But we're not there yet.  In between ASCII and Unicode there were a
bunch of other character sets, each designed for a particular region
or purpose, most only double the size of ASCII (256 characters).  The
problem is there's no way to tell just by looking at the data what
character set it's supposed to be in.  So now we have text flying
around in all
sorts of different encodings, being converted to and from Unicode and
plain-ASCII encodings of Unicode, and it's very complicated and easy
for the software (or the programmer) to get wrong.
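
Here is a small Python sketch of how that goes wrong.  The byte value
and the list of character sets are arbitrary choices of mine for
illustration; the codec names are the ones Python's standard library
uses.

```python
# One and the same byte means a different character in each character set.
raw = bytes([0xE9])
readings = {cs: raw.decode(cs) for cs in ("latin-1", "iso8859-7", "koi8-r")}
for charset, char in readings.items():
    print(charset, repr(char))

# Typical mangling: text goes out as UTF-8 and is read back as Latin-1.
original = "caf\u00e9"                    # "cafe" with an acute on the e
utf8_bytes = original.encode("utf-8")     # the accented e becomes two bytes
mangled = utf8_bytes.decode("latin-1")    # each byte is read as its own char
print(mangled)                            # five characters instead of four

# The damage can be undone only if you know which wrong guess was made.
repaired = mangled.encode("latin-1").decode("utf-8")
print(repaired == original)               # True
```

Nothing in the byte 0xE9 itself says which reading is right, which is
why the receiving software has to be told, or has to guess.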

Someday.

-- 
Mark J. Reed <[log in to unmask]>