> I'll ruffle some feathers when I say this (though perhaps not on _this_
> list), but HTML is to my mind really just a style sheet. If you can
Actually, it's just the opposite: there are only four visual-markup
elements in the HTML DTD. I think what you are implying is that
RFC 1866 has little in the way of formal (container) structuring, and
that's quite true. (It took a major struggle to get what little there
is in there anyway :-)
> 1. There is no single TEI to HTML mapping. Some people will
> want their paragraphs rendered as an empty line, and will map the TEI <p>
> to an HTML <p>; others will want a carriage return followed by a tabbed
> indent, and will resort to some kludge like <br><pre>\t</pre>.
They _might_, if they haven't read the HTML3 proposals :-)
> 2. It is generally a waste of time (read: processing overhead)
> to employ SGML conversion tools / languages like Omnimark for on-the-fly
> conversions. Most people I know use some sort of pattern-matching
Quite right: the processing overhead is far too great for on-the-fly
conversion. For my $0.02, disk space is cheap but memory and cycles
are expensive, so if you deal in whole files, convert them once and
store the results, unless your files really do change every day.
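As a rough sketch of the convert-and-store approach (Python here purely
for illustration; the function name and the timestamp test are my own
assumptions, not anyone's production setup), you regenerate the stored
HTML only when the source file is newer than the stored copy:

```python
from pathlib import Path
from typing import Callable


def convert_if_stale(src: Path, dst: Path, convert: Callable[[str], str]) -> bool:
    """Regenerate dst from src only if dst is missing or older than src.

    `convert` is any text-to-text function (hypothetical here); it stands
    in for whatever SGML-to-HTML conversion you actually run.
    Returns True if a conversion was performed, False if the stored copy
    was still current.
    """
    if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
        return False  # stored copy is up to date, so spend no cycles on it
    dst.write_text(convert(src.read_text(encoding="utf-8")), encoding="utf-8")
    return True
```

Run it from a nightly job or from the server itself; either way the
expensive conversion happens at most once per change to the source.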
> I think that Virginia and Michigan both use Perl scripts, unless I'm
> mistaken. If you really want to optimize for speed, I would suggest Lex.
> That's what I use, and in my initial trials, there was a noticeable
> improvement in performance.
Even a sed script will shift some lead if you need to munge the output
of a dynamic routine like a search system.
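For what it's worth, the pattern-matching approach amounts to a short
table of substitutions applied in order. A minimal sketch follows, in
Python rather than sed for readability. The three rules are only an
illustrative mapping I made up (as noted above, there is no single TEI
to HTML mapping), and `munge` is a hypothetical name:

```python
import re

# Illustrative rules only: each pair is (pattern, replacement), applied
# in order. Swap in whatever TEI-to-HTML mapping suits your own texts.
RULES = [
    (re.compile(r"<head>(.*?)</head>", re.S), r"<h2>\1</h2>"),
    (re.compile(r'<hi rend="italic">(.*?)</hi>', re.S), r"<i>\1</i>"),
    (re.compile(r"<lb\s*/?>"), "<br>"),
]


def munge(text: str) -> str:
    """Apply every substitution rule in turn and return the result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Feed it the output of the search routine and write the result to the
client. Regex substitution like this knows nothing about nesting or
context, so it only suits simple, regular markup.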