Tim Seid writes:
>What we are surprised at is the difficulty of finding a suitable search engine
>to browse, search, display the documents without paying a large sum up
>front and annual fees ($15,000 plus $5000 annual).
>We have been told in proposals everything from Google and PHPDig to
>packages like Alchemy and DT Search. None of these meet our criteria:
>- understand the XML documents (not just ignore the tags)
>- display search results with highlighting of some sort in file
>- run in Linux
>- basic functions of search engines
>Does anyone have suggestions of good, affordable search software?
An alternative approach, which we have used successfully for the Perdita
project (http://human.ntu.ac.uk/perdita/PERDITA.HTM), is to use XSLT to
generate a complete web site giving indexed access to the material.
Researchers can select material via one of a number of indexes
(currently genre, author name, repository, first lines of verse), each
generated from the relevant XML markup.
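
As a rough illustration of the idea (not the Perdita stylesheets
themselves), the sketch below builds an author index as a static HTML
page from XML markup. It uses Python's standard library rather than
XSLT, and the element names (`collection`, `item`, `author`, `title`)
and the `href` attribute are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from html import escape

# Hypothetical sample data; the element and attribute names are assumptions.
SAMPLE = """
<collection>
  <item href="ms001.html"><author>Author B</author><title>First title</title></item>
  <item href="ms002.html"><author>Author A</author><title>Second title</title></item>
  <item href="ms003.html"><author>Author B</author><title>Third title</title></item>
</collection>
"""

def build_index(xml_text, key="author"):
    """Group items by the text of the `key` child element."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for item in root.iter("item"):
        value = item.findtext(key, default="(unknown)").strip()
        title = item.findtext("title", default="(untitled)").strip()
        index[value].append((title, item.get("href", "#")))
    return dict(index)

def render_index(index, heading):
    """Render the grouped index as a static HTML page."""
    lines = ["<html><body>", f"<h1>{escape(heading)}</h1>"]
    for value in sorted(index):
        lines.append(f"<h2>{escape(value)}</h2>")
        lines.append("<ul>")
        for title, href in sorted(index[value]):
            link = f'<a href="{escape(href, quote=True)}">{escape(title)}</a>'
            lines.append(f"<li>{link}</li>")
        lines.append("</ul>")
    lines.append("</body></html>")
    return "\n".join(lines)

if __name__ == "__main__":
    print(render_index(build_index(SAMPLE), "Index by author"))
```

Calling `build_index` with a different `key` (say, a hypothetical
`genre` element) would give a second index from the same markup, which
is the point of driving the indexes from the XML.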
This approach has the advantage of being free (apart from any stylesheet
development costs), and of producing a plain HTML result which can be
mounted on CD for off-line publication. However, it doesn't offer
free-text searching, and it may not scale well to really large
collections (Perdita has a few hundred manuscripts).
Alternatively, you can look into the use of Open Source XML databases
such as Xindice or eXist as the foundation for XML-aware search
functionality - but you would then have to build the rest of the
application yourselves on top of these engines.
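
To make "XML-aware" concrete: the sketch below restricts a search to
the text of one named element and marks the hits, which is roughly the
kind of layer you would have to build on top of such an engine. It is
plain Python over an in-memory document, not Xindice or eXist code, and
the element names and the `**` highlight markers are assumptions for
illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are assumptions.
SAMPLE = """
<doc>
  <title>Notes on the garden</title>
  <p>The garden wall was rebuilt.</p>
  <note>Garden is also a surname.</note>
</doc>
"""

def search(xml_text, term, element):
    """Return the highlighted text of each `element` that contains `term`."""
    root = ET.fromstring(xml_text)
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    hits = []
    for node in root.iter(element):
        # Join all text inside the element, including nested children.
        text = "".join(node.itertext())
        if pattern.search(text):
            # Wrap each match in ** markers as a stand-in for highlighting.
            hits.append(pattern.sub(lambda m: f"**{m.group(0)}**", text))
    return hits

if __name__ == "__main__":
    for hit in search(SAMPLE, "garden", "p"):
        print(hit)
```

Searching for "garden" in `p` here ignores the matches in `title` and
`note`, which a tag-blind full-text engine could not do.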
SGML/XML and Museum Information Consultancy