Michael Beddow writes:
> scholarly lexicography. There, what's encoded in the Word file (or the
> typesetter tape) is often the result of countless editorial iterations
> and repeated expert checking. To rekey (or OCR and postedit) is in
> effect to put all that scholarly work at risk, or at best require it to
> be done all over again.
Sure. I think we'd all agree that the bigger or more complex the original,
the more effort can reasonably be expended on clever
intention-guessing software. I am cautioning that the point at which
you say "oh, the hell with it, send it to the Philippines and rekey
for $2 a page" comes earlier than some people believe.
> including by outside experts). Jon's ideas seem to me to be an exciting
> way of integrating the conversion process with the actual practice of
> scholarly lexicography.
For a big project, yes, I entirely applaud the ideas. Way back when, I
used to work for the Lexicon of Greek Personal Names typing in
material from hand-written cards. Being an impatient sort of person, I
started to write programs to check my work, and then moved on to other
people's; the quantity of errors revealed was immense. I am sure a lot
of people can relate similar stories.