On 6/10/2011 12:46 PM, stuart yeates wrote:
> On 10/06/11 16:41, Brett Zamir wrote:
>> On 6/10/2011 6:04 AM, stuart yeates wrote:
>>> (b) You fail to comment on whether the HTML-only tools and sites (the
>>> first and second of your advantages) are going to be able to respect
>>> the semantics of the TEI-in-HTML5.
>> Yes, as long as they support Microdata. Currently, in the case of
>> Mediawiki, I have a bug report for Mediawiki to whitelist such markup,
>> and I think it is likely to be accepted by them, as with any other HTML
>> tools, since properly allowing the semantic information provides no
>> security risk, unlike allowing say arbitrary XML.
> Excellent, so they'll enforce the content models of the tags? I was
> not aware of this.
Sorry, I misunderstood your meaning. The semantics can be preserved, but
no, such tools will not validate the content models unless they are
specifically designed to do so. As far as syntax goes, such HTML tools
may already enforce well-formed XHTML (and thus XML). Alternatively,
this opens up the possibility (for better and worse) of using TEI with
the less strict, but now well-documented and almost standardized, HTML
parsing algorithm.
Part of the reason I think X/HTML is appealing here is that it lets one
take advantage of tools or sites which might not have the ability or
wish to build in such custom validation support.
Given the Wild West nature of wikis in particular, you can't rely too
much at any given moment on anything there (at least on the popular
wikis of today). Besides solving this by building custom extensions for
one's own wiki which do enforce validation, one could also build
interfaces which access documents through the wikis' open APIs and then
do one's own validation checking, all without expecting sites like
Wikipedia to add custom support for semantic vocabularies like TEI.
It's less than ideal, I know, but better, in my opinion, than the
current lack of ability to use these sites for TEI at all.
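
A minimal sketch of the "do one's own validation" idea, in Python. It
assumes a purely hypothetical convention in which each TEI element is
carried as an HTML element whose itemprop attribute holds the TEI
element name, and a toy content model (ALLOWED_CHILDREN) standing in for
a real schema. Fetching the fragment through a wiki's API is left out;
the input is assumed to be well-formed XHTML already in hand.

```python
from html.parser import HTMLParser

# Toy content model (an assumption for illustration, not the real TEI schema):
# TEI element name -> TEI element names allowed as nested children.
ALLOWED_CHILDREN = {
    "persName": {"forename", "surname"},
    "forename": set(),
    "surname": set(),
}

# HTML void elements never get an end tag, so they are kept off the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TeiMicrodataChecker(HTMLParser):
    """Collects content-model errors for TEI names found in itemprop."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one entry per open HTML element: TEI name or None
        self.errors = []

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        if name is not None:
            # Nearest enclosing element that carries a TEI name, if any.
            parent = next((n for n in reversed(self.stack) if n is not None), None)
            if name not in ALLOWED_CHILDREN:
                self.errors.append(f"unknown element: {name}")
            elif parent is not None and name not in ALLOWED_CHILDREN.get(parent, set()):
                self.errors.append(f"{name} not allowed inside {parent}")
        if tag not in VOID_TAGS:
            self.stack.append(name)

    def handle_endtag(self, tag):
        # Assumes well-formed input, so end tags match the stack order.
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


def check_fragment(xhtml):
    """Return a list of content-model errors for an XHTML fragment."""
    checker = TeiMicrodataChecker()
    checker.feed(xhtml)
    checker.close()
    return checker.errors
```

Calling check_fragment() on a fragment returns an empty list when every
itemprop-named element sits inside a parent that permits it. A real tool
would derive the content model from the TEI schemas rather than from a
hand-written table like this one.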
>>> (c) You fail to comment on the ability of TEI-in-HTML5 to represent
>>> structures that can't be broken down into single hierarchical
>>> structure. For example marking up the dual meanings of
>> TEI-in-HTML5 is no different from TEI really. It is just more verbose.
>> One can use the non-hierarchical approaches detailed in
>> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html . If those
>> methods are not adequate, TEI-in-HTML5 will not offer any solutions not
>> solved by TEI XML. And as with TEI XML, you have to determine how to
>> display that information, if at all.
> So our current tools (schemas, schematrons, schema-understanding
> editors, etc) will work with TEI-in-HTML5?
I should have spoken more precisely. Technically, the TEI community will
probably wish to serialize into XHTML5 (the XHTML serialization of
HTML5), not HTML5, but it is common to speak of this all as just "HTML5".
Although one's existing schemas, schema-aware editors, etc. will not
work out of the box even with the TEI-in-XHTML5 serialization, part of
the process of determining a TEI-in-XHTML5 approach would surely need
to include providing a stylesheet to convert losslessly to and from
TEI-in-XHTML5 so one could then take advantage of such schemas. While I
don't want to add anything to Sebastian's plate, Roma could also
conceivably be enhanced to produce TEI-in-XHTML5 schemas. It would also
be conceivable for conversion tools to be created which convert existing
schemas into the XHTML5 flavor (or vice versa).
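
A rough sketch of what such a lossless round trip could look like,
written in Python rather than as a stylesheet. The mapping is an
assumption made only for illustration: every TEI element becomes a span
with the TEI name in itemprop, and every TEI attribute becomes a
data-tei-* attribute. Namespaces (including xml:id) are ignored here, so
the input is assumed to be un-namespaced TEI-like XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix used to carry TEI attributes on the XHTML side.
ATTR_PREFIX = "data-tei-"


def tei_to_xhtml(elem):
    """Map a TEI element tree onto spans carrying the TEI names."""
    out = ET.Element("span", {"itemprop": elem.tag})
    for key, value in elem.attrib.items():
        out.set(ATTR_PREFIX + key, value)
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(tei_to_xhtml(child))
    return out


def xhtml_to_tei(elem):
    """Invert tei_to_xhtml, restoring element names and attributes."""
    out = ET.Element(elem.get("itemprop"))
    for key, value in elem.attrib.items():
        if key.startswith(ATTR_PREFIX):
            out.set(key[len(ATTR_PREFIX):], value)
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(xhtml_to_tei(child))
    return out
```

Once a document has been converted back with xhtml_to_tei(), existing
schemas and schema-aware editors could be applied to the result as
usual. An actual conversion would need to settle the mapping convention
and handle the TEI namespace, which this sketch does not attempt.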
And I hope, now that HTML5 is spelling out HTML processing rules in such
minute detail, that it may even become possible for arbitrary HTML to be
reliably converted to XML (just as it can already be created from XML).
With the rules now spelled out in HTML5, choosing HTML over XHTML does
not make the markup any less precise than XML. However, its parsing
rules, besides being more complex for parsers, will be less obvious to
authors, and thus I think it continues to be compelling to use the XML
(XHTML)