On Tue, 2005-09-27 at 16:57, Wendell Piez wrote:
> Dear Peter,
> At 03:56 AM 9/27/2005, you wrote:
> > > I'm a bit mystified because what you say above suggests you have set
> > >
> > > <xsl:strip-space elements="*"/>
> > >
> > > which is not necessary.
> >*I* don't, but I've seen a lot of users trying it that way.
> Ah, so this is an argument for catering to ignorance? (It's okay if
> it is; sometimes this is necessary.)
Not so much catering to ignorance as avoiding an unpleasant surprise:
behaviour which conflicts with expectation, and thus causes what the
user would regard as unnecessary extra work. One of my charges used the
phrase "why couldn't they have made it do it the right way?"
> I think this is a tradeoff. Sometimes you want the tool just to "do
> the right thing". Sometimes you want more of an informed choice by
> the user. The latter, especially when "the right thing" has grey
> boundary lines (or is grey altogether). I'm not convinced I or we
> would be happier with a new schema dependency in XSLT processing.
> (I've stopped defaulting attributes entirely as I've shifted to more
> powerful, flexible and expressive schema regimens, such as RelaxNG+Schematron.)
Definitely, but we have to handle a lot of legacy, and a lot of that
simply isn't going to change in the foreseeable future.
> Again, depends on where you sit. Most of "my users" (I know lots of
> users at all levels from all sectors) don't have any basis for
> comparison to SGML, but I can sure see they're benefiting from, e.g.,
> the huge range of XML tools that SGML never had. There are many
> reasons for the existence of this toolkit; but I think XML's cleaner
> layering between instance and schema is one of them.
Yes, that and the vast majority of other goodies. This particular
problem is small in the global scheme of things, but when dealing with
very large quantities of extremely dense markup in mixed content, it's a
pain in the ass.
> >Then the transformation language should provide the facility to do it
> >right. XSLT does not at the moment provide this, AFAIK, because it
> >removes those white-space nodes in mixed content which should only be
> >compressed to a single space instead. It is precisely the transform
> >which is modifying the data, not the schema or the parse.
> Could you specify your requirement for munging again? Could it be
> done as a function munging text nodes?
With strip-space turned on globally (the case in point), the
white-space-only text nodes between elements don't make it through to
the XSLT, so you cannot address them :-)
> I've got whitespace cleanup stylesheets to help me when I really need
> them. But I haven't taken the time to try to generalize them. A
> generalized solution might want to take more than one pass, if it's
> to use a really simple algorithm.
Like many such problems, it's not a difficulty if you are in a position
to specify the list of element types for strip-space. My objection is to
the *default* behaviour for "*". Defaults should cater for the most
commonly-occurring circumstances, and not break the document model for
the sake of borderline cases and special parameters. And strip-space
can only take a list of name tests in its elements attribute: it would
be nice if you could use XPath :-)
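
To make the effect concrete, here is a minimal sketch in Python (not
XSLT) that imitates what strip-space does to white-space-only text
nodes, first for "*" and then for a named list of element types. The
sample document and the names strip_space and text_of are invented for
illustration; a real XSLT processor does this stripping while building
the source tree, before any template sees the nodes.

```python
import xml.etree.ElementTree as ET

XML_WS = " \t\r\n"  # the four characters XML treats as white space


def strip_space(elem, names=None):
    """Imitate xsl:strip-space on an ElementTree element (hypothetical helper).

    names=None plays the role of elements="*"; otherwise only elements
    whose tag is in `names` lose their white-space-only text children.
    """
    if names is None or elem.tag in names:
        # In ElementTree, the text children of `elem` are elem.text plus
        # the .tail of each child element.
        if elem.text is not None and not elem.text.strip(XML_WS):
            elem.text = None
        for child in elem:
            if child.tail is not None and not child.tail.strip(XML_WS):
                child.tail = None
    for child in elem:
        strip_space(child, names)


# Element content in <doc>, mixed content in <p>.
SAMPLE = "<doc>\n  <p>See <b>bold</b> <i>italic</i> text.</p>\n</doc>"


def text_of(xml, names=None):
    """Parse, strip, and return the concatenated text of the document."""
    root = ET.fromstring(xml)
    strip_space(root, names)
    return "".join(root.itertext())


print(text_of(SAMPLE))           # See bolditalic text.
print(text_of(SAMPLE, {"doc"}))  # See bold italic text.
```

With "*" the single space between the b and i elements is a
white-space-only text node, so it vanishes and the two words run
together. Naming only doc removes the indentation in element content
and leaves the mixed content in p alone.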
Web browsers have an implicit version of strip-space in their default
handling of HTML. Imagine if the space between two adjacent elements in
mixed content suddenly disappeared.
> It sounds like an Extreme paper, Peter.
"Why I demand white-space" or "Strip-space considered harmful" :-)
> And all of this is thankfully beside the original question, the right
> answer to which depends on much more than whether XSLT happens to be
> "good enough" in this respect.
Certainly good enough. Just unfortunate that this circumstance was not
more obvious at the time: part of the problem seems to be the CS-derived
insistence on treating the document as a tree, even when it's not.