Some weeks ago there was a question about SGML software. The response
did little more than list a dozen or so packages. Can anybody provide
a little more information than that? I've been reading the TEI guidelines
and van Herwijnen's book, and have been getting enthused about
using SGML; I even have a project to work on. I want to purchase
some software but don't know how to choose what I need. In Chapter 10
van Herwijnen lists things to look for in an SGML editor and describes
a few systems, but none for MS-DOS, which I'm interested in. Can anyone
point to some reviews or make some general recommendations?
To describe more specifically what I'm interested in: A colleague and
I have begun capturing the text of a campus newspaper, primarily for
the purposes of full-text searching. The text is coming from two sources:
ASCII files with embedded formatting codes, which we began collecting a
few months ago, and back issues scanned with a Kurzweil 4000.
I am primarily interested in processing the ready-made E-text, writing
simple conversion programs to convert what formatting I can into structural
tags, and then doing some manual touch-up. My colleague is interested
in scanning, which of course requires a lot of manual work. As this project
evolves the textbase will migrate from platform to platform, so creating
a standard file right at the start makes a lot of sense. We are tentatively
starting with Folio Views on a PC; by the time we move up to something
bigger, maybe we will be using software that can read SGML directly (we
recently saw a demo of BASISplus which was mentioned on the list a while
ago--it does come with a utility that will load SGML documents, but I
don't know the particulars).
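
To make the conversion step concrete, here is a minimal sketch of the kind
of program I have in mind, written in Python. Everything specific in it is
an assumption made for illustration: the dot codes (.HL, .BY, .PP), the tag
names (article, headline, byline, p), and the function names are hypothetical
stand-ins, not the actual codes in the newspaper files and not taken from
any published DTD.

```python
# Hypothetical mapping from embedded formatting codes to structural tags.
# Neither the codes nor the tag names come from a real system or DTD.
CODE_TO_TAG = {".HL": "headline", ".BY": "byline"}


def escape(text):
    """Replace characters that would be read as markup with entity references."""
    return text.replace("&", "&amp;").replace("<", "&lt;")


def convert(lines):
    """Turn lines with embedded dot codes into a list of tagged lines."""
    out = ["<article>"]
    para = []  # text lines collected for the current paragraph

    def flush():
        # Emit the collected paragraph, if any, as one <p> element.
        if para:
            out.append("<p>" + escape(" ".join(para)) + "</p>")
            para.clear()

    for raw in lines:
        line = raw.strip()
        code, _, rest = line.partition(" ")
        if code in CODE_TO_TAG:
            # A one-line element such as a headline or byline.
            flush()
            tag = CODE_TO_TAG[code]
            out.append(f"<{tag}>{escape(rest.strip())}</{tag}>")
        elif line == ".PP" or not line:
            # Assumed paragraph-break code, or a blank line: close the paragraph.
            flush()
        else:
            para.append(line)

    flush()
    out.append("</article>")
    return out


if __name__ == "__main__":
    sample = [".HL Budget & Fees", ".BY Staff Writer", ".PP",
              "Tuition will rise", "next fall."]
    print("\n".join(convert(sample)))
```

Anything the table does not recognize falls through as paragraph text, which
is where the manual touch-up would come in.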
By the way, does anyone know where I can find a DTD for newspapers?
Kent State University Libraries Kent, Ohio 44242
216-672-3024 FAX 216-672-2265