I have a small set of historical documents marked up in an XML schema of my own design. The
purpose is to extract 'biographical details' of historical people from the documents to
assist in research (e.g. all paragraphs that refer to 'Mr. J. Smith'). I built an XSLT stylesheet that
searches through the XML documents and, for each distinct personal name mentioned, outputs the
bibliographic details of each document in which the name appears and the document element(s)
containing it. I then use this output to direct additional research in order to resolve, if
possible, ambiguous references to individuals with the same name.
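To make the idea concrete, here is a minimal sketch of that kind of name index. It is written in Python rather than XSLT purely for illustration, and the element names (<docs>, <doc>, <title>, <p>, <name>) and the sample text are hypothetical stand-ins, not my actual schema or data:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample; element names and content are assumptions for illustration.
SAMPLE = """
<docs>
  <doc>
    <title>Letter A</title>
    <p>Land was granted to <name>Mr J. Smith</name> in the district.</p>
    <p>No names here.</p>
  </doc>
  <doc>
    <title>Letter B</title>
    <p><name>Mr J. Smith</name> and <name>Mrs A. Brown</name> attended.</p>
  </doc>
</docs>
"""

def build_name_index(xml_text):
    """Map each personal name to a list of (document title, paragraph text) pairs."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for doc in root.iter("doc"):
        title = doc.findtext("title", default="(untitled)")
        for para in doc.iter("p"):
            # Flatten the paragraph to plain text and normalise whitespace.
            text = " ".join("".join(para.itertext()).split())
            # A set avoids listing a paragraph twice when a name repeats in it.
            names = {n.text.strip() for n in para.iter("name") if n.text}
            for name in sorted(names):
                index[name].append((title, text))
    return dict(index)

index = build_name_index(SAMPLE)
for name, hits in sorted(index.items()):
    print(name)
    for title, text in hits:
        print(f"  {title}: {text}")
```

The index is keyed on the name string exactly as written, so 'Mr J. Smith' and 'Mr. J. Smith' would be treated as different people; resolving that is the manual research step.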
Recently I've been thinking about using TEI (probably TEI Lite) rather than my own custom schema.
What I'm not sure about is how to handle data tables in TEI, of which there are many in the
historical documents (e.g. a list of land holders and their holdings). Pre-TEI I would have used
markup such as:
<name>Mr J. Smith</name>
In the TEI document this becomes a row marked up using <table> and its sub-elements:
<table rows="2" cols="4">
<row role="data">
<cell><name>Mr J. Smith</name></cell>
If I am restricted to <table>, many of the cells in data rows have no explicit markup - you have to
locate the label row in order to find out what type of data a cell is supposed to contain. I know
how to do this thanks to the support staff at Oxygen.
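For reference, the label-row lookup can be sketched roughly as follows. This is Python rather than XSLT, the column labels and values are invented for illustration, and it assumes one label row and no cells spanning several columns:

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # TEI P5 namespace

# Hypothetical table: the labels and values are invented for illustration.
SAMPLE = """
<table xmlns="http://www.tei-c.org/ns/1.0" rows="2" cols="2">
  <row role="label">
    <cell>Landholder</cell>
    <cell>Acres</cell>
  </row>
  <row role="data">
    <cell><name>Mr J. Smith</name></cell>
    <cell>40</cell>
  </row>
</table>
"""

def cell_text(cell):
    """Return the cell's text content with whitespace normalised."""
    return " ".join("".join(cell.itertext()).split())

def table_records(xml_text):
    """Turn each data row into a dict keyed by the label row's cell text."""
    table = ET.fromstring(xml_text)
    rows = table.findall(f"{TEI}row")
    # Assumes exactly one row carries role="label".
    label_row = next(r for r in rows if r.get("role") == "label")
    labels = [cell_text(c) for c in label_row.findall(f"{TEI}cell")]
    records = []
    for row in rows:
        if row.get("role") != "data":
            continue
        values = [cell_text(c) for c in row.findall(f"{TEI}cell")]
        # Pair cells with labels by position; spanning cells are not handled.
        records.append(dict(zip(labels, values)))
    return records

records = table_records(SAMPLE)
print(records)
```

This works, but the meaning of each cell is only recoverable by position, which is the limitation described above.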
My question: Can I add markup to the <table> so that I can query/transform it as <table> or as
<landholder> from the one TEI-encoded document?
From my recent reading on the topic, it seems feature structures and <ana> could be part of the
solution, but I am not sure how to use them and still adhere to the TEI standard. I want to keep to
standard TEI because it has so many benefits but at the same time be able to query the document
tables as if they were similar to database tables.
I'm relatively new to XML and TEI. I hope my question makes sense.
Perth, Western Australia