Suppose you have a language composed of a discrete, finite set of k syllables. I was considering the ideal way to construct vocabulary for that language. My idea was to divide all concepts into separate categories, one for each syllable. Then each category would be equally subdivided into subcategories, then subsubcategories, and so forth. Identifying any of the n words in this language would then only be a search in O(k * log_k(n)), where log_k is log base k: one syllable per level, with at most k choices to check at each level. That is, you have to know what each syllable means, and then each one narrows down the word lookup exponentially. It would be as if every word beginning with 'a' were related somehow, in a way that all other words are not.
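
To make the arithmetic concrete, here is a minimal Python sketch of that scheme. It assumes a perfectly balanced category tree, where concept number `index` is simply spelled out as base-k digits with one syllable per level. The function names and the three sample syllables are made up for illustration, not part of any real proposal.

```python
def syllables_needed(k: int, n: int) -> int:
    """Smallest depth d such that k**d >= n, i.e. the ceiling of log base k of n."""
    depth, capacity = 0, 1
    while capacity < n:
        capacity *= k
        depth += 1
    return depth


def encode(index: int, syllables: list[str], depth: int) -> str:
    """Spell concept number `index` as `depth` syllables, most general category first."""
    k = len(syllables)
    if not 0 <= index < k ** depth:
        raise ValueError("index does not fit in a word of this depth")
    parts = []
    for _ in range(depth):
        index, digit = divmod(index, k)  # peel off the most specific level first
        parts.append(syllables[digit])
    return "".join(reversed(parts))


# Hypothetical inventory of three syllables, so k = 3.
SYLLABLES = ["ba", "ko", "tu"]
print(syllables_needed(len(SYLLABLES), 9))  # two syllables cover nine concepts
print(encode(5, SYLLABLES, 2))              # category "ko", subcategory "tu"
```

With this layout, two words that share their first syllable fall in the same top-level category, which is the "every word beginning with 'a'" property.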

It sounds like a great strategy, but I've been having problems with the fact that many concepts we think up are very specific.  Horse, for instance. It's a four-legged ungulate equid, a mammal that eats hay, carries people, has a large bottom, its coat is referred to as hide not fur, it has a mane referred to as hair, as in 'horsehair', etc. etc.  Just to call a horse a living organism that's an animal, chordate, mammal, ungulate, equid, Equus caballus would alone take 7 syllables.  How would I differentiate the horse from the zebra, from the weasel, from the sea squirt, if I tried to limit it to 4 syllables of specification?  That is, I'd have a 4-syllable word for living organism animal chordate, which is already pretty darn long compared to the 1-syllable 'horse'.

What I end up with is an extremely deep and sparse distribution, which is very frustrating because a lot of the concepts it makes room for, like the other non-horse members of genus Equus, do not even exist! Certainly they're not found in common conversation.  Should I just assign vocabulary randomly? It'd be an even spread, but it would be a lot harder to remember if, for instance, xrbtsx is horse and xrblsx is desk lamp.

I had one more idea: instead of starting with general categories, I start with specific terms, then generalize.  So I could have 'to' mean horse, and 'tobu' be anything in Equus caballus, and 'tobuba' be anything in the equid family, and so forth.  Trouble with that is, which specific concepts get to be the root of all language? Wouldn't they have to be generalized, by necessity?
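
A rough Python sketch of that bottom-up idea, to show where the trouble comes in. It assumes every syllable is exactly two letters, as in 'to' / 'tobu' / 'tobuba'. The helper names and the zebra root 'ze' are hypothetical, invented only for this example.

```python
def split_syllables(word: str) -> list[str]:
    """Split a word into two-letter syllables (an assumption of this sketch)."""
    if len(word) % 2 != 0:
        raise ValueError("expected a whole number of two-letter syllables")
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def generalizes(general: str, specific: str) -> bool:
    """True if `general` is `specific` plus one or more extra syllables."""
    g, s = split_syllables(general), split_syllables(specific)
    return len(g) > len(s) and g[:len(s)] == s


print(generalizes("tobu", "to"))    # True: the species word is built on the horse root
print(generalizes("tobuba", "to"))  # True: so is the family word
print(generalizes("tobuba", "ze"))  # False: a zebra root 'ze' is not visibly in the family
```

The last line is the problem in miniature: once the family word is built on the horse root, every other member of the family has to either borrow that root or lose the visible relationship.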
-- 
Pandora "Starling/Tasci/Antinomy/Figment/???" synx
jabber: http://synx.us.to/jabber.png