On 16/02/14 03:06, Syd Bauman wrote:
> (I really like this example because it also demonstrates the
> short-coming in TEI encoding of end-of-line hyphens: is the word
> after "punish" supposed to be re-constituted as "evil-doers" or
> "evildoers"? Many encoders or projects may not wish to commit to one
> or the other, but if you do, TEI does not give you a standard way to
> differentiate. But I'm getting side-tracked ...)
Wouldn't you use <hyphenation eol="some"/> in the header if you decide
to remove the ones you're "committed to"?
As to the tokenization, @break does this for you (more or less):

  evil-<lb break="yes"/>doers -> tokenize as evil- doers (or "evil doers"
  if you decided to suppress the hyphen too)
  evil-<lb break="no"/>doers -> tokenize as evil-doers
  evil-<lb break="maybe"/>doers -> if you don't want to take up a
  position at all.
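
For anyone who wants to see the three cases side by side, here is a minimal
Python sketch of a tokenizer driven by @break. It is an illustration only: the
function name, the keep_hyphen switch, and the "|" marker for the undecided
case are all hypothetical, it assumes un-namespaced fragments, and it treats a
missing @break as "yes", which is an assumption rather than anything the
Guidelines promise.

```python
import xml.etree.ElementTree as ET


def tokenize(fragment: str, keep_hyphen: bool = True, undecided: str = "|") -> list[str]:
    """Split a flat text fragment into tokens, honouring <lb break="..."/>.

    Hypothetical helper: 'undecided' is a marker left in place of
    <lb break="maybe"/> so a later step can still make the decision.
    """
    # Wrap the fragment so that it parses as a single element.
    root = ET.fromstring(f"<wrap>{fragment}</wrap>")
    text = root.text or ""
    for child in root:
        if child.tag == "lb":
            # Assumption: an absent @break is treated like break="yes".
            brk = child.get("break", "yes")
            if brk == "yes":
                # The line break separates two tokens.
                if not keep_hyphen:
                    text = text.rstrip("-")
                text += " "
            elif brk == "maybe":
                # No position taken: leave a marker instead of joining or splitting.
                text += undecided
            # break="no": join the two halves, so nothing is added.
        else:
            # Any other child element simply contributes its text content.
            text += "".join(child.itertext())
        text += child.tail or ""
    return text.split()


if __name__ == "__main__":
    for value in ("yes", "no", "maybe"):
        print(value, tokenize(f'evil-<lb break="{value}"/>doers'))
```

Leaving a marker for break="maybe" is just one way of not deciding; a project
could equally keep both readings and choose at query time.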
Back in the main thread, however, yes I totally agree that the so-called
Popham Proposal (if he's really responsible for it) is a Very Bad Idea.
I suspect it may have been a consequence of a data model which didn't
permit milestone elements outside divs.
Note, however, that just saying "we always encode <note>s inline, even
when their content spans a page" is OK if your objective is "find out
what page I begin on", but less so if you want to do something like
"count all the words on this page, whether within notes or not".
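
To make that difficulty concrete, here is a naive Python sketch that counts
words per page by walking the tree in document order and switching pages at
each <pb/>. The function name and the sample markup are hypothetical, and
namespaces are ignored. With an inline <note> that contains a <pb/>, every
word after the note is attributed to the later page, even if the main text
was still on the earlier one in the source.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def words_per_page(xml: str) -> Counter:
    """Naively count whitespace-separated words per page, in document order."""
    counts = Counter()
    page = None  # words before the first <pb/> are counted under None

    def walk(el):
        nonlocal page
        if el.tag == "pb":
            # A page-break milestone: everything that follows is on page @n.
            page = el.get("n")
        counts[page] += len((el.text or "").split())
        for child in el:
            walk(child)
            # The tail follows the child, so it belongs to whatever page
            # is current once the child has been fully walked.
            counts[page] += len((child.tail or "").split())

    walk(ET.fromstring(xml))
    return counts


if __name__ == "__main__":
    sample = (
        '<div><pb n="1"/><p>one two '
        '<note>three <pb n="2"/>four five</note> six</p></div>'
    )
    print(words_per_page(sample))
```

In the sample, "six" lands on page 2 only because the note's page break came
first in document order, which is exactly the sort of result you'd want to
guard against.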