Your basic point is a good one. Things are "watered down" because there
was no agreement on what values would be good for these semi-open categories,
and there was a feeling that one should either attempt to use closed
vocabularies or leave the choice up to the encoder. There are also some
theoretical issues that come into play. Some dictionaries will use traditional
parts of speech like noun/verb/etc., but others will be new dictionaries
designed for use with automatic parsing systems, where there may not even be
a finite set of possible categories (in unification grammars, for instance,
grammatical categories are feature structures). The TEI in general, due to
its wide range of application, cannot afford to be prescriptive in the ways
that a preparer of new data would like.
From a commercial standpoint, the ticket is probably to use the DTD
extension capabilities and TEIROLE to make a more restrictive DTD that will
enforce a particular standard. These modified DTDs are the kind of thing that
a trade association might distribute. The TEI has in fact allowed documents
with a wide range of differing markup styles to be TEI-conformant. Thus one can
create a more restrictive application of TEI for the benefits you name, and
still have the compatibility with TEI as a bonus.
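
To make the idea of a closed vocabulary concrete, here is a rough sketch in
Python (not DTD syntax, and not part of the TEI itself) of the kind of check a
more restrictive application might impose on part-of-speech values. The
vocabulary in ALLOWED_POS and the function name find_bad_pos are made up for
illustration, and the sample entry is written as well-formed XML only so that
a stock parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical closed vocabulary; a real project or trade association
# would agree on its own list.
ALLOWED_POS = {"noun", "verb", "adjective", "adverb"}


def find_bad_pos(xml_text, allowed=ALLOWED_POS):
    """Return the <pos> values that fall outside the closed vocabulary."""
    root = ET.fromstring(xml_text)
    bad = []
    for pos in root.iter("pos"):
        value = (pos.text or "").strip()
        if value not in allowed:
            bad.append(value)
    return bad


# Illustrative entry: one value is in the vocabulary, one is free text.
SAMPLE = """
<entry>
  <form><orth>run</orth></form>
  <gramGrp><pos>verb</pos></gramGrp>
  <gramGrp><pos>vb. intr.</pos></gramGrp>
</entry>
"""

print(find_bad_pos(SAMPLE))  # prints ['vb. intr.']
```

An encoder describing an existing printed dictionary would want "vb. intr."
to pass as it stands; someone creating new data to a house standard would
want it flagged, which is the difference a stricter application buys.
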
It just seems that the flexibility the TEI requires for describing
existing documents is compatible with the kinds of prescriptive restriction
one might want for data creation.
-- David Durand