> Also thanks Benct, I'll try your script. I only need to adjust it so
> that $V includes āēīō and the diphthongs (which I've collapsed to one
> letter).

Make sure Perl sees āēīō as one letter each, too.  It's pretty good
about Unicode things if you use the encoding and utf8 pragmas, as Benct does:

perl -le 'print length("āēīō")'                       # 8.  Oops!
perl -Mutf8 -le 'print length("āēīō")'             # 4. better.
echo "āēīō" | perl -Mutf8 -lne 'print length'   # 8 again.  utf8 only
                                                # helps with literals

This works:
echo "āēīō" | perl -Mutf8 -Mencoding=utf8  -lne 'print length'

It's important because character classes - that is, things like
[aeiou] and [^aeiou] - only match one character.  If you have to deal
with possible multi-character sequences, the positive match is still
straightforward - you use something like (?:a|e|i|o|u) - but the
negated match becomes a harder problem.  Of course, defining a consonant
as "everything that's not a vowel" is lazy, and not necessarily in
the good Perl way.  It's better to make an explicit list of the
consonants, which you probably have to do anyway in order to divide
them up into whatever subcategories determine the proper
syllabification of words like "adrisu".

Mark J. Reed