> > Back in January there was some discussion of various ways to indicate
> > nonfiling characters in <title>s. Now that I'm facing a similar
> > situation (having to sort lots of titles from TEI documents), I'd
> > like to suggest another method that may suit our purposes better.
> > Using a <seg> to indicate the nonfiling characters would interfere
> > with phrase searching for us, but I don't mind the other suggestion,
> > encoding the filing title explicitly as an alternate <title> element.
> > However, is it too much of a stretch to consider a filing title as a
> > particular rendition, and therefore put the appropriate data in a
> > rend attribute?
> > <title rend="nfc:4">The homes of the New world</title>
> This is close to the old MARC technique mentioned in the
> initial mail from Kevin Hawkins, isn't it?
> Does anyone have a suggestion for how to indicate nonfiling
> characters (as they're called in MARC) in TEI <title>s? For example,
> "The Wizard of Oz" should be filed under "w", not "t". In MARC, you
> encode a "4" with the title to indicate that the first four characters
> should be ignored when alphabetizing.
> This was quite an elegant solution, in my eyes. But Kevin argues:
> "Naturally, a solution for TEI doesn't have to be based on
> counting characters".
> I don't understand the "naturally", assuming the characters are
> counted as 'atomic entities'. (How else could one determine that the
> title "_Uuml_ber den Dilettantismus" files under _D_ilettantismus?)
Sorry to have dropped off the list for so long after bringing up a few
questions in February. I was overwhelmed while relocating. Thanks to
everyone for keeping the discussions going in my absence.
When I said "Naturally, a solution for TEI doesn't have to be based on
counting characters", I was referring to the architecture of TEI compared
with MARC. MARC counts characters to determine where to begin filing. TEI
*could* do this, but doesn't have to. Something like:
<title><seg type="nonfiling">&Uuml;ber den </seg>Dilettantismus</title>
<title>&Uuml;ber den <seg type="filing">Dilettantismus</seg></title>
conveys the sort order just as well, but uses nested tags instead of a
character count.
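
To make the comparison concrete, here is a rough sketch of how a filing key
could be derived from each encoding. It is only an illustration: the function
name filing_key is made up, the "nfc:N" reading of rend follows Peter Gorman's
suggestion rather than any TEI rule, and the sample uses a literal "Ü" because
Python's ElementTree will not resolve &Uuml; without a DTD.

```python
import xml.etree.ElementTree as ET


def filing_key(title_xml: str) -> str:
    """Return a lowercase filing key for one <title> element (hypothetical helper)."""
    title = ET.fromstring(title_xml)
    full_text = "".join(title.itertext())

    # MARC-style: rend="nfc:4" means "skip the first 4 characters".
    # This rend convention is an assumption taken from the quoted suggestion.
    rend = title.get("rend", "")
    if rend.startswith("nfc:"):
        return full_text[int(rend[4:]):].strip().casefold()

    # Nested-tag style, variant 1: the filing part is marked explicitly.
    filing = title.find("seg[@type='filing']")
    if filing is not None:
        return "".join(filing.itertext()).strip().casefold()

    # Nested-tag style, variant 2: drop anything marked as nonfiling.
    parts = [title.text or ""]
    for child in title:
        nonfiling = child.tag == "seg" and child.get("type") == "nonfiling"
        if not nonfiling:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts).strip().casefold()


counted = filing_key('<title rend="nfc:4">The homes of the New world</title>')
nonfiling = filing_key(
    '<title><seg type="nonfiling">Über den </seg>Dilettantismus</title>'
)
marked = filing_key(
    '<title>Über den <seg type="filing">Dilettantismus</seg></title>'
)
```

Both tagged variants produce the same key without anyone having to decide
what counts as one character, which is the point about architecture.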
Syd asked a while ago what solution I wound up using in the end. Since
sorting was really only a question of presentation to the user, I opted for
having our programmer set our interface to ignore "a", "an", and "the" in
sorting. So I didn't encode it at all!
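
I don't have our programmer's code to hand, so the following is only a sketch
of the idea and not what our interface actually runs. The names
LEADING_ARTICLE and sort_key are invented for the example, and it handles
English articles only.

```python
import re

# Matches one leading English article followed by whitespace.
LEADING_ARTICLE = re.compile(r"^(?:a|an|the)\s+", re.IGNORECASE)


def sort_key(title: str) -> str:
    """Lowercase the title and drop a single leading "a", "an", or "the"."""
    return LEADING_ARTICLE.sub("", title.strip(), count=1).casefold()


titles = ["The Wizard of Oz", "An Ideal Husband", "A Tale of Two Cities", "Emma"]
sorted_titles = sorted(titles, key=sort_key)
```

A fixed list of articles like this would not cope with a title such as
"Über den Dilettantismus", which is where explicit encoding would still help.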
Peter Gorman's solution isn't a bad one, but strictly speaking, I think how
titles are filed is a question of presentation of a group of titles, not of
any particular title.