John W. Kennedy wrote:
> Joshua Hutchinson wrote:
>> On 3/6/07, Julia Flanders <[log in to unmask]> wrote:
>>> Since I don't know very much about Jon's project, it's hard for me to
>>> say at this point whether the semantic nuance he asks about is
>>> pointless, essential, or somewhere in between, but it's certainly an
>>> interesting area to explore.
>> I'd hazard that Jon's question was prompted (at least in large part)
>> by conversations we've been having about Project Gutenberg's
>> efforts to switch to a TEI-based master encoding.
>> So, knowing that our "markup editors" will be volunteers coming from a
>> largely book-loving background and not a scholarly background (and
>> hence tend to think in terms of layout vs in terms of semantics), how
>> would you approach this type of issue? I.e., how strictly would you
>> like to see PG stick to a "semantic markup only" philosophy? Where is
>> the balance between ease of markup and good strict practices?
> If they're going to use TEI, then they should do it properly. Otherwise,
> let 'em stick with HTML, and save both the temporary labor of converting
> to TEI and the permanent hardware overhead of converting back again on
> every use.
If only we knew what "doing it properly" meant....
(Incidentally, as Jon knows but evidently others don't, PG has been
working on producing documents in TEI for many years: Frank Boumphrey
produced a tutorial for PG volunteers, a copy of which has been on the
TEI website since 2001: see http://gutenberg.hwg.org/teidtds.html)
I think this discussion is exactly the one that every TEI project has to
have when it starts considering how it will use the Guidelines: which of
the many possible distinctions are going to be useful *for my needs*?
Which can I be sure of making accurately and consistently in my source?
The reason we have <hi> in the TEI scheme is that there will
always be cases where deciding which of the possible semantics applies
is hard or impossible or just too expensive to do accurately and
consistently. This applies to modern material just as much as it does
to early print. So use it in good health. It's up to you to decide which
"semantic" decisions are useful to you. And, by the way, doesn't the
same set of issues apply to "non-semantic" markup? When we say "it's
in italic" we're usually confounding several different font variants.
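To make the trade-off concrete, here is a rough sketch of the same
sentence encoded both ways. The sentence is invented for illustration,
the attribute syntax assumes P5-style TEI (xml:lang), and the value
rend="italic" is only one plausible project convention, not something
the Guidelines mandate.

```xml
<!-- Semantic tagging: the encoder is confident why each phrase is italic -->
<p>She had been reading <title>Middlemarch</title> and found it
<emph>very</emph> long, with a certain
<foreign xml:lang="fr">je ne sais quoi</foreign>.</p>

<!-- Fallback: record only that the text is highlighted, and how -->
<p>She had been reading <hi rend="italic">Middlemarch</hi> and found it
<hi rend="italic">very</hi> long, with a certain
<hi rend="italic">je ne sais quoi</hi>.</p>
```

A project can mix the two, using the semantic elements where volunteers
can decide reliably and <hi> everywhere else.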
I can't say how much I disagree with the "let 'em stick with HTML" comment
without being rude, so I won't say anything!