At 12:42 PM 1/4/2007, Martin Holmes wrote:
>I just think that customization which results in documents that
>don't validate against tei_all should be a last resort.
But isn't this rather different from saying such documents should not
be considered to be conformant TEI?
>My gut feeling is that people will rarely take the time to examine
>individual schemas and figure out how to process them; more likely,
>they'll examine well-defined and widely-used schemas (such as XHTML
>1.1 or Docbook, and hopefully P5 tei_all) and write processors that
>can handle documents that conform to those, and the rest will fall
>by the wayside unless they're very important for some reason. Anyone
>writing an automatic harvesting system to do a particular piece of
>research, build a metadata database or something similar is unlikely
>to care about my little collection of a couple of hundred documents
>enough to examine my schema.
Actually there is a large and thriving market in data conversion. The
simple requirement to get Word documents into some form of reusable
XML is a phenomenal market driver. And extracting metadata from
binary presentational formats into harvesters makes getting TEI in
look like an afternoon in the park -- despite how TEI metadata is
It's true that TEI projects are smaller and therefore have to work
harder to get their works into the mainstream of document processing.
But that's actually been the whole idea of TEI all along -- to make
it possible for scholars to use something better than Word for their
small, peculiar kinds of text encoding, addressing their small,
peculiar -- but very interesting, and occasionally quite important --
requirements. The beauty of TEI is precisely that for 90% of their
conversion, they can go with stuff that's very generic (downloaded
off tei-c.org), leaving them to concentrate on the interesting 10%.
>But this is futurology, based on the assumption that 640K of RAM is
>enough for anyone...
Not only that, but I don't believe it'd be fair to say "No one's
going to bother to learn Cherokee; Cherokee authors should write in
English". Where do you draw the line? Should we all learn to write
"Basic English" with its 850 words?
Wendell Piez
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
Mulberry Technologies: A Consultancy Specializing in SGML and XML