At 6:44 PM 2/15/95, Gregory J. Murphy wrote:
>Before I start chipping away at yet another wheel...
>Has anyone out there written a utility that reads sgmls 1.1 output and
>puts it back into sgml? What I'm after is a way to easily "normalize"
>sgml - supply omitted tags and inherited attributes and the like.
In general this is impossible. *start SGML gripe* Sgmls gives only
information in the ESIS, which hides certain critical pieces of
information: non-system entity references are expanded out. Elements
declared EMPTY are reported as empty start/end pairs, with no
indication that they were declared EMPTY. Writing the end-tag back out
would, of course, not be legal SGML.
You'd also lose entity declarations in the DTD subset.
It's a shame that ESIS, which has no official status, is so canonical in
practice. The DSSSL document model is much better, but -- no public
implementations as far as I am aware....
*end gripe, start semi-productive workarounds*
On the bright side:
You can special-case EMPTY elements at the cost of a special script for
each DTD (or set of TEI modifications) involving empty elements.
You can map all your text-entities foo to the string "&foo;" with
special entity declarations. (Or let SGML pass you the entity name as
implied, and slap delimiters around it. I don't remember the details on
this, so you'd have to check Goldfarb. I'm not even sure that "implied
entity value" is the correct term).
Copy the doc subset when you slap in your doctype declaration at the top of
You also lose non-significant whitespace (not so bad), and SGML comments
(maybe bad depending on your tagging practice).
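
For what it's worth, here is a rough Python sketch of the first two
workarounds. It is not a tested tool. It assumes the sgmls output
conventions as I understand them: "A" lines carry attributes and precede
their "(" start-tag line, ")" closes an element, "-" is data, "?" is a
processing instruction, and data uses the \n, \\, \| and \ooo escapes.
The EMPTY_ELEMENTS table and the names esis_to_sgml and
cdata_entity_decls are made up for illustration, and the CDATA entity
form is only my guess at what "special entity declarations" would look
like, so check it against Goldfarb before relying on it.

```python
import re

# Hypothetical per-DTD table of elements declared EMPTY, kept by hand.
# sgmls normally reports names in upper case, so the table is upper case.
EMPTY_ELEMENTS = {"PB", "LB"}

_ESCAPE = re.compile(r"\\(\\|n|\||[0-7]{3})")


def unescape(text):
    """Decode the data escapes assumed above: \\n, \\\\, \\| and \\ooo."""
    def repl(match):
        code = match.group(1)
        if code == "n":
            return "\n"          # record end
        if code == "\\":
            return "\\"          # literal backslash
        if code == "|":
            return ""            # SDATA bracket, dropped in this sketch
        return chr(int(code, 8))  # three-digit octal character code
    return _ESCAPE.sub(repl, text)


def esis_to_sgml(lines, empty=EMPTY_ELEMENTS):
    """Turn sgmls output lines back into fully tagged SGML text.

    Data is written back as-is, so any markup delimiters in it still
    need whatever treatment suits the DTD in use.
    """
    out = []
    pending = []  # attributes seen since the last start-tag
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        cmd, rest = line[0], line[1:]
        if cmd == "A":
            parts = rest.split(" ", 2)
            if len(parts) < 2 or parts[1] == "IMPLIED":
                continue  # nothing to write for an implied attribute
            value = unescape(parts[2]) if len(parts) > 2 else ""
            quote = "'" if '"' in value else '"'
            pending.append(f" {parts[0]}={quote}{value}{quote}")
        elif cmd == "(":
            out.append(f"<{rest}{''.join(pending)}>")
            pending = []
        elif cmd == ")":
            # The special case: no end-tag for elements declared EMPTY.
            if rest.upper() not in empty:
                out.append(f"</{rest}>")
        elif cmd == "-":
            out.append(unescape(rest))
        elif cmd == "?":
            out.append(f"<?{unescape(rest)}>")
        # Anything else ("&", "C", and so on) is ignored in this sketch.
    return "".join(out)


def cdata_entity_decls(names):
    """Build declarations that map each entity foo to the string "&foo;"."""
    return "\n".join(f'<!ENTITY {name} CDATA "&{name};">' for name in names)
```

You would feed it the lines of an sgmls run, for example
esis_to_sgml(open("doc.esis")), and put the output of
cdata_entity_decls in the declaration subset so the references survive
the round trip as "&foo;" text.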
>If not, would anyone be interested in having such a tool?
I use CoST when I need to do this. I may do a lighter weight sgmls/TCL port
that would run faster, but your Unix box may be speedier than the smallish
one that I've used for this task.
>The nice people at SoftQuad pointed out to me that I could obtain the
>results I desired by importing documents into Author/Editor, and then
>exporting them. But somehow I can't picture myself whiling away the
>midnight hours importing and then exporting text after text after text...
I don't know for sure, but I imagine that you could write a Scheme script
in AE that would loop through your directories and convert files in batch.
Other than CoST I'm not aware of any released PD tools that can do the job.
Anyone else have any ideas?