Have been too busy to follow all the discussions closely, but...

The nice thing about something like Microdata (or RDFa) in HTML is that, since you are committing at least to some generic structural markup like div or span, the browser and tools like CKEditor know how to render it and it can be styled (including being hidden), yet it makes no commitment to semantics until you give it an @itemtype namespace.

Some tools (some wikis, discussion forums, etc.) might still strip out this semantic information, since security-conscious sites white-list allowable HTML elements and attributes, and they may not yet be aware that these new HTML5 attributes are harmless to allow (or no one has gotten around to submitting a bug report to enable them). But with increasing awareness of Microdata, tools should increasingly enable its use virtually anywhere on the web, especially in better-supported projects.

(The only reason I can think of for sites not allowing them is fear of semantic data spam: people might add hidden semantic code in the hope that their search-engine-detectable phrases would show up in someone's searches. At this stage, though, too few people are using Microdata to make that likely, I would think.)
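To make the white-listing point concrete, here is a rough sketch in Python of a sanitizer that keeps the Microdata attributes while still dropping everything else it does not recognize. The tag and attribute lists, the class name and the `sanitize` function are all made up for illustration; no particular wiki or forum works exactly this way, and this is nowhere near a security-grade filter.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical white-lists; a real site would maintain its own.
ALLOWED_TAGS = {"div", "span", "p", "em", "strong"}
ALLOWED_ATTRS = {"class", "title"}
# The attributes defined by HTML5 Microdata.
MICRODATA_ATTRS = {"itemscope", "itemtype", "itemprop", "itemid", "itemref"}


class WhitelistSanitizer(HTMLParser):
    """Re-serialize HTML, keeping only white-listed tags and attributes."""

    def __init__(self, allow_microdata=True):
        super().__init__(convert_charrefs=True)
        extra = MICRODATA_ATTRS if allow_microdata else set()
        self.allowed_attrs = ALLOWED_ATTRS | extra
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        kept = []
        for name, value in attrs:
            if name not in self.allowed_attrs:
                continue
            if value is None:  # valueless attribute such as itemscope
                kept.append(name)
            else:
                kept.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join([tag] + kept) + ">")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(html, allow_microdata=True):
    parser = WhitelistSanitizer(allow_microdata)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With `allow_microdata=False` the same input comes back as bare div and span elements, which is roughly what a site that has not yet white-listed these attributes would produce.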

With this approach, there is no question about how the processor should handle an unrecognized namespace: it is completely ignored, since it is known to be a semantic namespace rather than an application-based one. The only way it would be used is by client-side tools or harvesters which expose it to searches, etc., and such tools could attract the widest development support because they could be designed to be namespace-agnostic. If you want to search for the TEI namespace on Google, go ahead and do that. Or likewise with your own custom namespace.
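As a rough illustration of what I mean by a namespace-agnostic harvester, here is a minimal Python sketch that groups itemprop values by whatever @itemtype they sit under, without knowing anything about the vocabulary. The class and function names are my own invention, and it cuts corners that the real Microdata extraction algorithm does not: it ignores itemref, takes values only from text content (not from href, content, etc.), and treats nested items naively.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never get an end tag, so they are not tracked as open.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class MicrodataHarvester(HTMLParser):
    """Collect itemprop text grouped by itemtype, whatever the vocabulary."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []                  # one frame per open element
        self.found = defaultdict(list)   # itemtype -> [(itemprop, text), ...]

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        a = dict(attrs)
        # A property belongs to the item that encloses this element.
        owner = self.stack[-1]["type"] if self.stack else None
        scope = owner
        if "itemscope" in a:
            scope = a.get("itemtype") or "(untyped)"
        self.stack.append({"tag": tag, "type": scope, "owner": owner,
                           "prop": a.get("itemprop"), "text": []})

    def handle_data(self, data):
        # Text counts toward every open element that carries an itemprop.
        for frame in self.stack:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if not any(f["tag"] == tag for f in self.stack):
            return  # stray end tag; ignore it
        while self.stack:
            frame = self.stack.pop()
            if frame["prop"] and frame["owner"]:
                text = "".join(frame["text"]).strip()
                self.found[frame["owner"]].append((frame["prop"], text))
            if frame["tag"] == tag:
                break


def harvest(html):
    harvester = MicrodataHarvester()
    harvester.feed(html)
    harvester.close()
    return dict(harvester.found)
```

Fed a page that uses one item typed with the TEI namespace URI and another typed with a private namespace, `harvest` returns both under their own keys; nothing in the code cares which is which, and a consumer that does not recognize a key can simply skip it.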

Best wishes,

On 8/24/2011 6:22 AM, Wendell Piez wrote:
Dear Sebastian,

On 8/23/2011 5:23 PM, Sebastian Rahtz wrote:
It's just that others, I hear, want
some confidence, on finding a TEI document in the wild, that they can
drop it into a TEI system and get decent results.

It all comes down to the meaning of "in the wild", I suppose. If you pick
a random document from my filestore whose name ends in .xml,
and which starts <TEI, then yes, all bets are off. If you click on the
"download 80%TEI version" on my website, or visit http://example.com/texts/156/tei,
then it's up to me to make sure that what I expose is interchangeable. I
may also make available http://example.com/texts/156/teifull, of course.

But we agree (golly, is that the first time?) on this.

I believe we have agreed in the past. But I'm not sure we could agree on when we have agreed, even if we could agree on having agreed at all.

What bothers me about interchanging mixed namespace TEI XML
is the tacit assumption that if I meet <XXX xmlns="http://example.com/wendell/NS"><ZZZ>x</ZZZ><YYY>y</YYY>z</XXX>
I can follow a reliable procedure to ignore your markup which I don't grok.
But what _is_ the algorithm there?

You have a point that mixed-namespace development on a small core would not solve the interchange problem in the general case. It might, however, address it for the core, while exposing it for the rest.

Currently, the capability for interchange is not only dependent on extra development cycles, but also (a) the work you have to perform is not trivial, (b) it is over and above the work you are already (over-) committed to doing just to design and support your application, and (c) there is often no good way even to detect where there is a need for it other than inspection and fine-toothed validation against sets of constraints that by definition are not shared by the community, or they wouldn't be at issue.

The fact that the work could be done, or at least facilitated, by the publisher (who has a better understanding of local semantics) only makes this worse when the publisher doesn't do it, which is the usual case, since the onus is generally on the receiver to make it happen.

It's quite true that if you accept data marked up in my namespace, you are then at my mercy to document and perhaps help process the markup in a way I expect it to be processed. But presumably the TEI Consortium could refuse me the use of namespace "http://tei-c.org/ns/experimental/wendell2011a" until I demonstrated the utility and suitability of my nifty new tag set for others (maybe meeting a two-implementation requirement?), documented it to its standards with worked examples, and offered a dumbing-down transformation into 80% TEI to accommodate anyone who didn't want to support the tagging natively.


Wendell Piez                            mailto:[log in to unmask]
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
  Mulberry Technologies: A Consultancy Specializing in SGML and XML