Right, now we've settled choice's hash, time to cast and interpret a few
runes....

I think the basic thing to hold on to here is that <c> is meant to contain
one single abstract character, or something (generally an NCR or a CER) that
resolves to a single abstract character. How abstract characters map on to
syllables is specific to the writing system and language concerned. In a
Latin-style script, the contents of a <c> may signify only one component of
a syllable; in a syllabary, it will often (though not always) correspond to
a whole syllable; in an ideographic script, one abstract character will
often represent a sequence of syllables.

I can write the Tagalog word for eye(s) in four Latin abstract characters as
mata, or (if I'm designing some sort of folksy CD cover) in two Tagalog
abstract characters as U+170B U+1706. The character count of the
Tagalog-script version matches the syllable count, but the reason why the
character or <c> count for that version is two has no essential link with
the number of syllables represented: there are always two syllables in the
word, no matter how many abstract characters my chosen writing system uses
to represent them. So if I wanted to use <c> elements, I would need four of
them for the form mata and only two for the historical script version.
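
To make the counting concrete, here is a rough Python sketch that wraps each
character of both spellings in its own <c>. The helper name wrap_in_c is
hypothetical, and the sketch assumes one code point per abstract character.
That holds for these two strings but would not hold for text containing
combining marks, such as the Tagalog vowel signs. Non-ASCII characters are
written out as NCRs, each of which resolves to a single abstract character.

```python
import unicodedata
from xml.sax.saxutils import escape


def wrap_in_c(text):
    """Wrap each code point of `text` in its own <c> element.

    Hypothetical helper: assumes one code point == one abstract character.
    ASCII characters are XML-escaped; everything else becomes a hex NCR.
    """
    parts = []
    for ch in text:
        if ord(ch) < 128:
            content = escape(ch)
        else:
            content = f"&#x{ord(ch):X};"
        parts.append(f"<c>{content}</c>")
    return "".join(parts)


latin = "mata"                    # four Latin abstract characters
tagalog_script = "\u170B\u1706"   # TAGALOG LETTER MA, TAGALOG LETTER TA

latin_markup = wrap_in_c(latin)
tagalog_markup = wrap_in_c(tagalog_script)

print(latin_markup)
print(tagalog_markup)
for ch in tagalog_script:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The first string yields four <c> elements and the second yields two, purely
as a consequence of how many characters each writing system needs.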

Michael Beddow