Sebastian Rahtz wrote:
> Conal Tuohy wrote:
>> When a user searches a collection of TEI documents, and receives a list
>> of links (hits), what do those links point to?
> Shouldn't the user have a choice of this? It seems
> to me that the search engine and API should assume
> searching on anything, and returning any level of granularity;
> but that a particular interface may limit you to predefined
> routes to make it easier to use.
To a degree, isn't this determined by the nature of the text? Let's say I have a
whole bunch of poems marked up and am searching those for a particular word. I
probably wouldn't want just that word returned, but the whole line, line group
or poem. In a lot of straightforward cases it seems that the three usual levels
above word level are the ones a user would normally want: line/linegroup/poem,
sentence/paragraph/section, line/speech/scene, etc. But I agree with Sebastian
that in the best of all possible search engines you should be able to say
"search for 'love' in any line, but show me the whole poem/scene it comes from",
or have the option to say "show me the linegroup/speech".
When dealing with a collection of texts that are homogeneous in structure, this
is easy. I think, as Conal was hinting, that in a collection of disparate,
heterogeneous texts it is more difficult.
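To make the "search in any line, but show me the enclosing unit" idea concrete,
here is a minimal sketch in Python. It is only an illustration under stated
assumptions: the sample markup is an un-namespaced, TEI-like fragment invented
for the example, and find_units and its parameters are hypothetical names, not
part of any existing search engine or TEI tool.

```python
import re
import xml.etree.ElementTree as ET

# Invented, un-namespaced TEI-like sample (an assumption for this sketch).
SAMPLE = """
<text>
  <div type="poem" n="1">
    <lg><l>My love is like a red, red rose</l><l>That's newly sprung in June</l></lg>
  </div>
  <div type="poem" n="2">
    <lg><l>The curfew tolls the knell of parting day</l></lg>
  </div>
</text>
"""

def find_units(root, word, hit_tag="l", unit_tag="div", unit_type=None):
    """Search every hit_tag element for a whole word and return the nearest
    enclosing unit_tag element (optionally with a given type attribute),
    without duplicates and in document order."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    results = []
    for hit in root.iter(hit_tag):
        if not pattern.search("".join(hit.itertext())):
            continue
        # Climb from the hit towards the root until the requested unit is found.
        node = hit
        while node is not None:
            type_ok = unit_type is None or node.get("type") == unit_type
            if node.tag == unit_tag and type_ok:
                if node not in results:
                    results.append(node)
                break
            node = parent.get(node)
    return results

root = ET.fromstring(SAMPLE)
# Same search, two levels of granularity chosen by the caller.
poems = find_units(root, "love", unit_tag="div", unit_type="poem")
linegroups = find_units(root, "love", unit_tag="lg")
```

The point of the sketch is that the place where the match happens (hit_tag) and
the unit returned to the user (unit_tag, unit_type) are separate parameters, so
an interface can fix them or expose them as a choice.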
If one were to store what the basic unit of granularity should be for a
particular text, where would one store it? Let's say I want all searches on
one text to return <div type="poem"> results and searches on another to return
<div type="scene">. Is there an intelligible way I can store this so that my
XSLT can pick it up on a document-by-document basis?
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk