I think your best approach would be a simple XSLT transformation. What
kind of output format do you want? What will you be using to display the
On 2018-03-03 07:23 AM, Ciarán Ó Duibhín wrote:
> May I repeat this request, hopefully more clearly.
> I would like to locate any program (preferably for Windows) for making
> indexes, word lists, or concordances from TEI text, and which will
> interpret the <c> tag in the following way, which I hope is in
> accordance with its description as "non-lexical character": the content
> of the <c> tag is to be dropped in extracting tokens, but is to be
> included in displaying segments of text.
> For example, the text "an b<c>h</c>ean" should yield tokens "an" and
> "bean", but should be displayed as "an bhean".
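
To make the requested behaviour concrete, here is a minimal sketch in Python (standard library only) rather than the XSLT suggested above. It is an illustration under assumptions, not an existing tool: the function names `local_name`, `text_of` and `display_and_tokens` are hypothetical, the wrapping `<p>` element is added only so the fragment parses, and tokens are simply split on whitespace. The display string keeps the content of `<c>`; the token string drops it but keeps the text that follows the closing tag.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a namespace prefix such as '{http://www.tei-c.org/ns/1.0}c' down to 'c'."""
    return tag.rsplit("}", 1)[-1]


def text_of(elem, keep_c):
    """Concatenate the text under elem; skip the content of <c> when keep_c is False."""
    if not keep_c and local_name(elem.tag) == "c":
        return ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(text_of(child, keep_c))
        # The tail is the text after the child's closing tag; it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


def display_and_tokens(xml_fragment):
    """Return (display string, token list) for a well-formed XML fragment."""
    root = ET.fromstring(xml_fragment)
    display = text_of(root, keep_c=True)
    # Assumption: whitespace splitting is a good enough tokenizer for the sketch.
    tokens = text_of(root, keep_c=False).split()
    return display, tokens


display, tokens = display_and_tokens("<p>an b<c>h</c>ean</p>")
print(display)  # an bhean
print(tokens)   # ['an', 'bean']
```

A real concordance would also need to record where each token sits in the display string, so that a hit on "bean" can be shown in context as "bhean"; the sketch leaves that out.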