At 07:52 AM 9/26/2005, you wrote:
>So an application like XSLT has a switch to allow the preservation or
>removal of white-space. Unfortunately the distinction is binary: you
>can either keep it or lose it, and no distinction is made between
>white-space found in element content and that found in mixed or PCDATA
>content because *at the time of parsing it*, the parser may not know
>what further type of content awaits it within the current element.
>The result is that text nodes containing only white-space tokens are
>removed entirely when the strip-space switch is ON. My argument is that
>if at least one subelement has already been encountered in the current
>element, then white-space-only nodes should no longer be suppressed in
>this element, but collapsed to a single space token. This would still
>permit the suppression of leading white space nodes, which is almost
>always what you want, but it would defeat the suppression of trailing
>white-space nodes (because in mixed content a preceding element would
>have been encountered).
I'm a bit mystified, because what you say above suggests you have set
<xsl:strip-space elements="*"/>
which is not necessary.
If you have a schema (or even if you don't, but know what the schema
would tell you), it's not hard to say
<xsl:strip-space elements="TEI.2 body div"/>
where TEI.2, body and div are those elements in your schema defined
as having element-only content. This way you can safely dispose of
whitespace-only nodes that are there only for cosmetic reasons in the
code, while safely leaving in place any whitespace that might matter.
What's so hard about that?
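To make that concrete, here is a minimal sketch of a complete stylesheet built around that declaration. It assumes XSLT 1.0 and an identity transform for everything else; the list of element-only elements is only the three named above, so check it against your own schema before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Drop whitespace-only text nodes, but only when their parent is one
       of the elements assumed here to have element-only content. -->
  <xsl:strip-space elements="TEI.2 body div"/>

  <!-- Identity template: copy every other node and attribute unchanged,
       including whitespace-only text nodes in elements not listed above. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Given a hypothetical fragment such as

```xml
<div>
  <p><hi>white</hi> <hi>space</hi></p>
</div>
```

the line breaks and indentation directly inside div are discarded, because div is listed. The single space between the two hi elements is a child of p, which is not listed, so it survives, and that is the trailing-whitespace case you are worried about.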
Respectfully, while I understand why you want XML tools to do a
better job at what-was-once-intended-for-SGML tools, I don't think
any other suggestion gets anywhere close to the right balance. In
particular, I do not think it would be a net gain if we had another
area where a document processed with a schema gives different results
from the same document processed without a schema. If your schema has
the effect of modifying your data when it is processed, IMO it should
be a transform. ;->
Wendell Piez
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
Mulberry Technologies: A Consultancy Specializing in SGML and XML