Quick, perhaps incomplete response --

Piotr,

I often run afoul of the fact that most strings are valid as a URI.
E.g., all of the following are valid xsd:anyURIs, as far as I know:
 http://www.example.com/
 htp://www.example.com/
 htttp:/www.example.com/
 why:is:this_bloody_thing_a_(valid)_URI?

If I really want a test string to fail against xsd:anyURI, I put in
'%' signs without hex digits following. E.g., "a%bad%URI" should
fail.

But indeed, for this reason, you may very well want to compare against
a regex rather than use `castable as xsd:anyURI`.
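
For what it's worth, a regex-based check might look something like the
sketch below. It is only an illustration, not a tested rule: the rule
context, the list of allowed schemes, and the pattern itself are my
assumptions, the `sch:` prefix is assumed to be bound to the ISO
Schematron namespace, and `matches()` needs an XPath 2.0 (or later)
query binding.

```xml
<!-- Sketch only: the context, the allowed schemes, and the pattern are illustrative assumptions. -->
<sch:rule context="*[@my_attribute]">
  <!-- Require http, https, or ftp, then "://", then one or more characters,
       each being either a well-formed %XX escape or any character that is
       neither whitespace nor a "%" sign. -->
  <sch:assert test="matches(@my_attribute, '^(https?|ftp)://([^\s%]|%[0-9A-Fa-f]{2})+$')">
    @my_attribute should be an absolute http, https, or ftp URI.
  </sch:assert>
</sch:rule>
```

A pattern like this would flag "htp://www.example.com/" and "a%bad%URI",
but it would also flag relative references such as "#some-id", so it
would need loosening wherever those are legitimate values.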


> One more little thing, about the xsd:anyURI check: I can't get it
> to work, and the problem appears to only concern xsd:anyURI,
> whether I use (1) to go via the TEI layer or (2) directly.
> 
> (1) <dataRef key="teidata.pointer"/>
> 
> (2) <dataRef name="anyURI"/>
> 
> If I do, e.g.
> 
> <sch:assert test="@my_attribute castable as xsd:anyURI">,
> 
> I can put anything into the attribute value, and it won't get flagged by 
> Schematron. I have done these checks with other data types, directly or 
> indirectly, and they all worked (so it's not that I use wrong syntax or 
> anything of that sort), but the behaviour of anyURI is different.
> 
> Is there some insider info anyone would care to share on this, please?
> 
> I looked for other examples of that within TEI/P5 and in Stylesheets, 
> but wasn't able to find any.
> 
> While composing this message, I found the following passage in the W3C 
> spec on XSD datatypes, and I hope I interpret it wrongly when I think 
> that this may be the issue:
> 
> "Because it is impractical for processors to check that a value is a 
> context-appropriate URI reference, this specification follows the lead 
> of [RFC 2396] (as amended by [RFC 2732]) in this matter: such rules and 
> restrictions are not part of type validity and are not checked by 
> ·minimally conforming· processors. Thus in practice the above definition 
> imposes only very modest obligations on ·minimally conforming· processors."
> 
> https://www.w3.org/TR/xmlschema-2/#anyURI
> 
> Would it be correct to assume that we're looking at a "minimally 
> conforming" behaviour here? My checks were tested in oXygen, but also 
> during the TEI/P5 build process, which also validates with Schematron.
> 
> And a practical question: should I rather validate against a regex than 
> use `castable as xsd:anyURI`?