Paul Grosso wrote:
>ESIS was developed for, and is part of, the Conformance Testing
>standard which has been an ANSI standard for several years and became
>ISO 13673 last year. Its purpose in life is *not* to provide a
>normalized form of SGML or any sort of parser output that would be
>necessarily useful for further SGML-based applications; it was to
>provide a form of parser output intended to allow conformance testing,
>and all conforming parsers are required to be able to emit this
>specific form of output so that they can be tested for conformance.
>So ESIS does have formal status, and its use is "canonical" precisely
>insofar as parsers must emit it, but no one but a conformance testing
>application should expect to make productive use of it.
Yes, but this mandated role is not the only one that has been promulgated
for ESIS -- mostly because ISO 8879 itself lacks a proper
model. I'm not complaining about the existence of ESIS, just its common use
as the application/parser interface. This use of ESIS does have explicit
sanction in attachment 1 to "Recommendations for a possible revision of ISO
8879 (ISO/IEC JTC1/SC18/WG8/N1035)," which says "The set of information
that is acted upon by implementations of structure-controlled applications
is called the ... ESIS. ESIS is implicit in ISO8879, ... The purpose of
this paper is to define it explicitly." (SGML Handbook p.588). The other
categories of application have no defined model of what information they
might get from a parser, thus making vendor-independent processing outside
the bounds of ESIS problematic, at least. It may not be an ISO requirement,
but it is an influential position.
>DSSSL is very new--the final IS doesn't yet exist--and relatively
>complex even considering only simpler subsets. There are no implementations
>yet, public or otherwise--how could one expect otherwise?
I didn't mean to gripe about the current lack of implementations; I just
wanted to plug the superior-quality work in DSSSL while making it clear
that you can't just "go get DSSSL" and solve your problems (yet!).
It's great that we're getting a good model of what an SGML parser
produces. And DSSSL looks to be well enough defined to be implemented.
>It is interesting to note that the task of tools such as SGML-aware editors
>(such as the tools available from SoftQuad, ArborText, and others) must
>solve exactly the problems you are considering.
> It is also not well-defined just what
>it means to normalize an SGML file in this way. If you like what a given
>tool (e.g., SGML Editor) does by way of "normalization," I suggest you
>make use of it. Otherwise, if you have a relatively simple definition
>of what normalization means to you and you like to program and you have
>an SGML parser, you can try to write something; but you should realize
>that it isn't a necessarily well-defined, simple task.
It certainly isn't -- it's the root of many problems with automatically
processing SGML, and a pain in the neck. The funny thing about
normalization is that it's a term that gets used and understood without
much trouble, except that the more you know about SGML, the less clear cut
it is (because of all the funny cases that come up).
I'm not sure if more of this is relevant to the list -- I just want to
be sure that no one thinks I'm attacking DSSSL. Or SGML, for that matter --
though I don't want to whitewash some unfortunate problems for SGML->SGML
translation tools. Fortunately, you can hack your way around some of these
problems, especially if you don't have to solve them in general. Hopefully
the DSSSL work will bear fruit in resolving some of this.