When a user searches a collection of TEI documents, and receives a list
of links (hits), what do those links point to? In some cases they should
point to (a representation of) the full TEI document. In other cases
they could point to a chapter (div) or sub-section from a document. In
general the desired granularity might depend on a number of factors.

I would be interested to hear how other implementors of search engines
have dealt with this issue. What factors determine the granularity of
searching? How are they reflected at the level of TEI encoding?

I can think of several different possible approaches:

1) always index particular TEI elements (e.g. only the full document, or
div1 or div2 elements, or group/text elements, etc.)
2) choose a granularity appropriate to a particular text (e.g. by having
a set of predefined text-types, some of which are indexed to div1 level,
others div2, etc)
3) index at multiple levels simultaneously (so that a search may return
the same content by itself, as well as in a broader context, or several
broader and broader contexts).
4) index only elements which have some associated teiHeader metadata
(e.g. with @decls attributes)
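
To make the difference between approaches 1 and 3 concrete, here is a
rough Python sketch (not taken from any existing system). It assumes
TEI P5-style namespaced markup, xml:id attributes on the divisions, and
a naive substring match standing in for a real search engine. The names
SAMPLE, index_units and search are hypothetical, and the sample is a
cut-down fragment rather than a valid TEI document.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, heavily abbreviated sample (not schema-valid TEI).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <div1 xml:id="ch1"><head>Chapter One</head>
      <div2 xml:id="ch1s1"><p>Whaling in the south.</p></div2>
      <div2 xml:id="ch1s2"><p>Trade in the north.</p></div2>
    </div1>
  </body></text>
</TEI>"""


def index_units(root, levels):
    """Return (id, element name, text) for every element named in `levels`.

    A single name in `levels` corresponds to approach 1; several names
    give the overlapping units of approach 3.
    """
    units = []
    for elem in root.iter():
        local = elem.tag.split("}")[-1]  # drop the "{namespace}" prefix
        if local in levels:
            # Collapse whitespace so each unit is one searchable string.
            text = " ".join("".join(elem.itertext()).split())
            units.append((elem.get(XML_ID), local, text))
    return units


def search(units, term):
    """Naive case-insensitive substring search over the indexed units."""
    term = term.lower()
    return [(uid, level) for uid, level, text in units if term in text.lower()]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    multi = index_units(root, {"div1", "div2"})   # approach 3
    single = index_units(root, {"div1"})          # approach 1
    print(search(multi, "whaling"))   # the chapter and the section both hit
    print(search(single, "whaling"))  # only the chapter hits
```

With both levels indexed, a search for "whaling" returns the section
ch1s1 and also its enclosing chapter ch1, which is the "same content in
a broader context" effect described in approach 3. Whether such
duplicate hits should be merged or ranked is left open here.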

Thanks!

Con
--
Conal Tuohy
Senior Programmer
+64-4-463-6844
+64-21-237-2498
New Zealand Electronic Text Centre
www.nzetc.org