John W. Kennedy wrote:
> Joshua Hutchinson wrote:
>> On 3/6/07, Julia Flanders wrote:
>>> Since I don't know very much about Jon's project, it's hard for me to
>>> say at this point whether the semantic nuance he asks about is
>>> pointless, essential, or somewhere in between, but it's certainly an
>>> interesting area to explore.
>> I'd hazard that Jon's question was prompted (at least in large part)
>> due to conversations we've been having about Project Gutenberg's
>> efforts to switch to a TEI-based master encoding.
>> So, knowing that our "markup editors" will be volunteers coming from a
>> largely book-loving background and not a scholarly background (and
>> hence tend to think in terms of layout vs in terms of semantics), how
>> would you approach this type of issue?  ie, How strictly would you
>> like to see PG stick to "semantic markup only" philosophy?  Where is
>> the balance between ease of markup and good strict practices?
> If they're going to use TEI, then they should do it properly. Otherwise, 
> let 'em stick with HTML, and save both the temporary labor of converting 
> to TEI and the permanent hardware overhead of converting back again on 
> every use.

If only we knew what "doing it properly" meant....

(Incidentally, as Jon knows but evidently others don't, PG has been 
working on producing documents in TEI for many years: Frank Boumphrey 
produced a tutorial for PG volunteers, a copy of which has been on the 
TEI website since 2001: see

I think this discussion is exactly the one that every TEI project has to 
have when it starts considering how it will use the Guidelines: which of 
the many possible distinctions are going to be useful *for my needs*? 
Which can I be sure of making accurately and consistently in my source 

The reason that we have <hi> in the TEI scheme is because there will 
always be cases where deciding which of the possible semantics applies 
is hard or impossible or just too expensive to do accurately and 
consistently. This applies to modern material just as much as it does 
to early print. So use it in good health. It's up to you to decide which 
"semantic" decisions are useful to you. And, by the way, doesn't the 
same set of issues apply to "non-semantic" markup? When we say "it's 
in italic" we're usually confounding several different font variants.
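
To make the choice concrete, here is a rough sketch of the same kind of 
italicised word encoded both ways. The sentences are invented purely for 
illustration, and rend="italic" is just one commonly used value that a 
project might settle on, not something the Guidelines prescribe.

```xml
<!-- Semantic: the encoder decides *why* each word is italicised -->
<p>She had read <title>Middlemarch</title> twice.</p>
<p>It was a <foreign>faux pas</foreign>, and he knew it.</p>
<p>I said <emph>never</emph>.</p>

<!-- Fallback: the encoder records only *that* it is italicised -->
<p>It was a <hi rend="italic">faux pas</hi>, and he knew it.</p>
```

A project could quite reasonably adopt only the second form, or allow the 
first only for those distinctions its volunteers can make reliably.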

I can't say how much I disagree with the "let 'em stick with HTML" comment 
without being rude, so I won't say anything!