On Fri, Jan 23, 2004 at 09:38:49AM -0800, Gary Shannon wrote:
> But more to the point, where do I go to find out how
> those XSAMPA things are pronounced.  I followed a lot
> of Google links, but all I got was things like "close
> front unrounded" which means about as much to me as
> "top green polyester".  Is there a site that actually
> explains the difference between close front unrounded
> and top green polyester?

There are several explanations of the terminology online,
but there's also a site (whose URL I can't find at the moment)
which has audio files of the various symbols being pronounced.

In principle, the vowel terminology is straightforward.
"Rounded", for instance, means that your lips are rounded, as
they are for saying "ooh" (X-SAMPA [u]).  Thus, in French class,
English-speaking students are taught that to say a French <u>
(X-SAMPA [y]), they round their lips for "ooh" but inside their
mouth try to say "eee" (X-SAMPA [i]) instead.  The only difference
between [y] and [i] is that the former has the lips rounded and the
latter doesn't, and the same holds for the other rounded/unrounded
pairs on the chart.

The terms "open/close" and "front/back" refer to the position of the
tongue and jaw when pronouncing the vowel, which is harder to
get a feel for.  Basically, the tongue is used to close off part of your
mouth and create a resonating cavity that is smaller than the entire mouth,
and it is the shape of this cavity which determines which vowel is heard.
"Front" vowels have the tongue pushing forward in the mouth so the
resonating cavity doesn't extend very far toward the throat, while "back"
vowels use most of the front-back space in the mouth.  "Open/close" is
the vertical direction; "close" vowels have the tongue pushing upward
toward the roof of the mouth, so the resonating cavity is high in the
mouth, while "open" vowels have the tongue lower (and the jaw open
wider, usually), so more of the mouth's height is used.
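
If it helps to think of it as data, the three vowel dimensions can be
sketched as a small lookup table.  This is only an illustration: the
names VOWELS, describe and toggle_rounding are made up for this example,
and the table covers just a handful of X-SAMPA vowels, not the whole
chart.

```python
# Hypothetical feature table (an assumption for illustration only):
# X-SAMPA symbol -> (height, backness, lips rounded?)
VOWELS = {
    "i": ("close", "front", False),
    "y": ("close", "front", True),
    "M": ("close", "back", False),
    "u": ("close", "back", True),
    "a": ("open", "front", False),
    "A": ("open", "back", False),
    "Q": ("open", "back", True),
}


def describe(symbol):
    """Spell out the chart terminology for one vowel symbol."""
    height, backness, rounded = VOWELS[symbol]
    return f"{height} {backness} {'rounded' if rounded else 'unrounded'}"


def toggle_rounding(symbol):
    """Return the vowel that differs only in lip rounding, or None if
    this small table has no such entry."""
    height, backness, rounded = VOWELS[symbol]
    target = (height, backness, not rounded)
    for other, features in VOWELS.items():
        if features == target:
            return other
    return None


print(describe("i"))         # close front unrounded
print(toggle_rounding("i"))  # y
```

So "close front unrounded" is just [i], and flipping the rounding while
keeping the tongue where it is gives [y], which is the French-class
trick described earlier.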

The vowels around the edges of the IPA chart are "cardinal", which means
that they're at the extremes of their respective dimensions.  Most
dialects of most languages include non-cardinal vowels which are represented
by the symbol for the closest matching cardinal; when more precision is
required, there are diacritical marks that mean "more open", "more
fronted", etc.

Consonants are a little more complex.  Mostly we worry about the
"pulmonic" consonants at the top of the IPA chart; that just means
they're pronounced by expelling air from the lungs.   The rows refer to
how much the airway is blocked when the sound is produced.

"Plosives", also called "stops", stop the airflow completely while
being pronounced; English examples include b, p, d, t, g, and k.

"Nasals" have the airway through the mouth completely blocked, like a
stop, but air is allowed to escape through the nose;  as with vowels,
the shape of the mouth cavity distinguishes them from each other.
English nasals include m, n, and N (the "ng" sound in "sing").

"Trills" have the air shooting past either the uvula or the tip of the
tongue so fast that it vibrates against the mouth, rapidly closing and
opening the airway.

"Taps" and "flaps" are like trills that are stopped after a single
instance of the close-open cycle.

"Fricatives" obstruct the airflow partially, resulting in a
hissing-type sound; English examples include f, v, s, z, S (the
"sh" sound), Z ("s" in "measure"), T (the "th" in "path") and
D (the "th" in "the").

"Lateral fricatives" have the tongue blocking the airway vertically down
the center of the mouth while allowing air to escape on both sides of
the tongue, so the airstream is forked.   English doesn't have any of
these, but Welsh "ll" is an example.

"Approximants" have only a slightly narrowed airway, with no
interference of the fricative type.  As with vowels and nasals, it is
the shape of the airway which distinguishes them.  English approximants
include h, l, r\ (the general "r" in most varieties of American
English), w, and j (the consonantal "y" sound).

There are also "affricates", which aren't on the chart because they're
composed of combinations of plosives and fricatives, run together so
that they're pronounced as a single sound.  The English "ch" sound, for
instance, is t + S, while the English "j" sound is d + Z.  The English "x"
sound can be regarded as an affricate of k + s, but in most dialects the
two components are pronounced more distinctly; it's just k followed by s,
not k and s run together into a single sound.

The columns refer to where the constriction in the airway occurs.

"Bilabial" means that both lips are used; examples are p, b, and w.

"Labiodental" means that the teeth and lips are used together (usually
upper teeth and lower lip), as in English f and v.

"Dental" consonants like d, n, s, S use the tongue and the upper teeth;
there are further subdivisions based on whether the tongue touches the
actual teeth (true "dental"), the ridge just behind them
("alveolar"), or the roof of the mouth just behind that ridge
("postalveolar").  True "dental" may also be subdivided into plain
"dental", where the tongue touches the back of the teeth, and
"interdental", where the tongue is placed between the upper and lower
teeth (which is how T and D are usually pronounced).

"Retroflex" consonants are pronounced with the tip of the tongue curling
backward toward the throat so that the underside of the tongue touches
the roof of the mouth.  English doesn't have any of these.

"Palatal" consonants like j are pronounced with the body of the tongue flat
against the roof of the mouth (in the case of j, not all the way,
because it's only an approximant).

"Velar" consonants like k and g are pronounced by pushing the back of
the tongue against the back of the roof of the mouth, close to the

"Uvular" consonants are like velars only even further back, almost like

In "pharyngeal" and "glottal" consonants the closure is made in the throat
rather than the mouth, either in the throat itself ("pharyngeal") or at
the vocal cords ("glottal").
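
The row/column layout can be sketched as a lookup from each symbol to
its row (how the airway is blocked) and column (where).  Treat this as a
rough sketch: CHART and cell are names invented for the example, only
some English consonants are included, and I've assumed the usual English
placement of t, d, n, s, z on the alveolar ridge rather than on the
teeth.

```python
# Hypothetical excerpt of the pulmonic chart (illustration only):
# X-SAMPA symbol -> (row = manner, column = place)
CHART = {
    "p": ("plosive", "bilabial"),
    "b": ("plosive", "bilabial"),
    "t": ("plosive", "alveolar"),
    "d": ("plosive", "alveolar"),
    "k": ("plosive", "velar"),
    "g": ("plosive", "velar"),
    "m": ("nasal", "bilabial"),
    "n": ("nasal", "alveolar"),
    "N": ("nasal", "velar"),
    "f": ("fricative", "labiodental"),
    "v": ("fricative", "labiodental"),
    "T": ("fricative", "dental"),
    "D": ("fricative", "dental"),
    "s": ("fricative", "alveolar"),
    "z": ("fricative", "alveolar"),
    "S": ("fricative", "postalveolar"),
    "Z": ("fricative", "postalveolar"),
    "j": ("approximant", "palatal"),
}


def cell(manner, place):
    """Return the symbols that share one cell of the chart, in the order
    they were entered in the table above."""
    return [s for s, (m, p) in CHART.items() if (m, p) == (manner, place)]


print(cell("plosive", "bilabial"))  # ['p', 'b']
print(cell("nasal", "velar"))       # ['N']
```

An empty list just means this excerpt has nothing in that cell, not
that no language uses the sound.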

The distinction between pairs of consonants within the same cell of the
chart is between "voiced" and "voiceless".  In voiced consonants like
b, d, v, g, z, the vocal cords vibrate when they're pronounced, like a
tiny hum, whereas in voiceless consonants like p, t, f, k, and s, they
do not.

Sometimes pairs of distinctions go hand-in-hand; for instance, in English,
voiced consonants are more "lax" than their voiceless counterparts; the
muscles of the mouth are not as tense as they are for the voiceless
versions (which are accordingly called "tense").  In fact, this difference
is more pronounced than the voiced/voiceless distinction in English,
which is why people can tell the difference between t and d even when
the speaker is whispering.  There are diacritics that represent this
explicitly, but since the voice distinction goes along with it, there's
no need to complicate the notation for English unless extreme precision
is required for a given application.
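
The voiced/voiceless pairing can also be sketched as a simple lookup,
using only the English examples mentioned above.  The names VOICED_OF,
VOICELESS_OF, is_voiced and counterpart are invented for this example,
and the table says nothing about the tense/lax side of things.

```python
# Hypothetical pairing table (illustration only):
# voiceless X-SAMPA symbol -> its voiced counterpart in the same cell
VOICED_OF = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z", "S": "Z", "T": "D"}

# The same pairs looked up in the other direction (voiced -> voiceless).
VOICELESS_OF = {voiced: voiceless for voiceless, voiced in VOICED_OF.items()}


def is_voiced(symbol):
    """True if the symbol is the voiced member of a pair in this table."""
    return symbol in VOICELESS_OF


def counterpart(symbol):
    """Return the other member of the voiced/voiceless pair, or None if
    the symbol is not part of any pair in this table."""
    return VOICED_OF.get(symbol) or VOICELESS_OF.get(symbol)


print(counterpart("t"))  # d
print(counterpart("D"))  # T
print(is_voiced("z"))    # True
```

Sounds like m or l come back as None here simply because they have no
voiceless partner among the English examples given.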