Dear List,
 
Since we are low-traffic these days, I hope a slightly off-topic
question will be tolerated:
 
In a planned electronic edition, we want to present the text and the
digitized witness in an integrated way that allows navigating between
the two at the level of individual words, if possible.  That is, from
a given word in the electronic text, the user can get not only to an
image of the page that word is on; ideally she will also be presented
with the section of the image that contains the corresponding word,
with the word itself highlighted.
 
The issue now is not so much the presentation tool as a tool that
makes entering the necessary information (the mapping between words
and image coordinates) a more reasonable endeavour.
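 
To make the mapping concrete, here is a rough sketch (in Python, purely
illustrative; the field names and pixel units are my own assumptions,
and a real edition might encode the same information quite differently)
of the kind of record such a tool would have to produce for each word:
 
    # One record per word: the word's id in the electronic text, the
    # page image it appears on, and the bounding box of the word on
    # that image, in pixels.  All names here are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class WordZone:
        word_id: str      # id of the word token in the electronic text
        image_file: str   # digitized page the word appears on
        x: int            # left edge of the bounding box, in pixels
        y: int            # top edge of the bounding box, in pixels
        width: int        # width of the box, in pixels
        height: int       # height of the box, in pixels

    # e.g. WordZone("w42", "page_017.png", 311, 842, 96, 38)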
 
                               * * * * *
 
Can anyone point me to a finished program, or at least to one that
goes in the right direction?
 
(Open source that can be adapted to our project's needs would be ideal)
 
(Oh yes, and there is no big budget for this, unfortunately)
 
                               * * * * *
 
The simplest method would be to open an image in any image-processing
program, obtain coordinates by pointing with the mouse, and enter them
by hand into a table that already contains the words of the text.  But
this would exhaust patience, time and money (paying people for sheer
drudgery) and lead to poor results.
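 
(To illustrate, a minimal sketch of that "by hand" workflow, only very
slightly automated: display a page image, take two mouse clicks per
word for the corners of its box, and append the result to a CSV table.
This assumes Python with matplotlib; the file names, the word list and
the CSV layout are only placeholders.)
 
    import csv
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg

    words = ["In", "principio", "erat", "verbum"]  # text, in reading order
    clicks = []
    index = 0

    def on_click(event):
        global index
        if event.xdata is None or index >= len(words):
            return                              # outside the image, or done
        clicks.append((event.xdata, event.ydata))
        if len(clicks) == 2:                    # two clicks = one bounding box
            (x1, y1), (x2, y2) = clicks
            writer.writerow([words[index], "page_017.png",
                             round(min(x1, x2)), round(min(y1, y2)),
                             round(abs(x2 - x1)), round(abs(y2 - y1))])
            clicks.clear()
            index += 1
            print("next word:",
                  words[index] if index < len(words) else "done")

    with open("zones.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "image", "x", "y", "width", "height"])
        fig, ax = plt.subplots()
        ax.imshow(mpimg.imread("page_017.png"))
        fig.canvas.mpl_connect("button_press_event", on_click)
        plt.show()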
 
At the other extreme would be an autonomous program that works like
OCR software up to the point where the sections representing individual
words are recognized, but then matches them against the existing
electronic text and produces the finished tables.  This is, of course,
science fiction, or rather, humanities fiction.
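 
(Still, as a thought experiment, here is a crude sketch of that
direction: segment the page image into word-sized blobs and pair them,
in reading order, with the words of the existing electronic text.  It
assumes Python with OpenCV; the thresholds, the kernel size and the
naive line grouping are guesses that would need tuning per witness,
and the pairing only works if segmentation happens to find exactly one
blob per word, which is exactly the fiction part.)
 
    import cv2

    def word_boxes(image_path):
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        # binarize: ink becomes white (255), background black (0)
        _, binary = cv2.threshold(img, 0, 255,
                                  cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
        # smear horizontally so the letters of one word merge into one blob
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
        smeared = cv2.dilate(binary, kernel)
        contours, _ = cv2.findContours(smeared, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        boxes = [cv2.boundingRect(c) for c in contours]   # (x, y, w, h)
        # reading order: group into rough 40-pixel line bands, then left
        # to right within a band (the band height is a pure guess)
        boxes.sort(key=lambda b: (b[1] // 40, b[0]))
        return boxes

    def align(words, boxes):
        # naive one-to-one pairing; a real tool needs a reviewing step
        if len(words) != len(boxes):
            raise ValueError("segmentation does not match the word count")
        return list(zip(words, boxes))

    # e.g. align("In principio erat verbum".split(),
    #            word_boxes("page_017.png"))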
 
But if you know of something roughly in between, I'd be grateful for
any hint.
 
Thanks in advance,
 
                        Tobias
 
--
.............................................
       (_)                     Tobias Rischer
        "==='              [log in to unmask]
         " "
...still.loving.gnu..........................