Is there a recommended way to link to an external taxonomy or controlled
vocabulary in an unambiguous, machine-interpretable way?
I am hoping to document a practice which could be used by different
encoders, which would identify their classification systems reliably, so
that when these documents are collated, an automatic process will be
able to identify common classification systems used.
For instance, consider the following example from
<bibl>Library of Congress Subject Headings</bibl>
<bibl>Library of Congress Classification</bibl>
These bibliographic references are certainly unambiguous (to me at
least), but my concern is how to encode them (and others like them) in a
way which will be meaningful even to a simple XSLT script or similar.
These names "Library of Congress Subject Headings" and "Library of
Congress Classification" could be specified as elements in a controlled
vocabulary (of controlled vocabularies!) in a set of encoding
guidelines. But people using other languages might like to refer to
these schemes by other names. IMHO a far better technique (for software
agents) would be to identify the external schemes by a (persistent!)
URI, such as http://purl.org/dc/terms/LCSH or
http://purl.org/dc/terms/LCC which are URIs published for this purpose
(i.e. as identifiers) by the Dublin Core Metadata Initiative. Even if
there existed a few suitable URIs for each scheme, this could still be
more manageable than using textual strings.
How should this be encoded in TEI?
<ref target="http://purl.org/dc/terms/LCSH">Library of Congress Subject Headings</ref>
<ref target="http://purl.org/dc/terms/LCC">Library of Congress Classification</ref>
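To illustrate why a URI is easier for a simple script to work with than a
textual name, here is a rough Python sketch of the collation step. It is only
an illustration under some assumptions: the two sample documents, the
<listBibl>/<bibl> wrapping, and the function names scheme_uris and
common_schemes are all hypothetical, and the samples omit the TEI namespace
for brevity (real TEI P5 files would need namespace-aware lookups).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample documents from two encoders. The display text differs,
# but the target URI identifying the scheme is the same.
DOC_A = """<listBibl>
  <bibl><ref target="http://purl.org/dc/terms/LCSH">Library of Congress Subject Headings</ref></bibl>
  <bibl><ref target="http://purl.org/dc/terms/LCC">Library of Congress Classification</ref></bibl>
</listBibl>"""

DOC_B = """<listBibl>
  <bibl><ref target="http://purl.org/dc/terms/LCSH">LCSH (label in another language)</ref></bibl>
</listBibl>"""


def scheme_uris(xml_text):
    """Return the set of ref/@target URIs found in one document."""
    root = ET.fromstring(xml_text)
    return {ref.get("target") for ref in root.iter("ref") if ref.get("target")}


def common_schemes(documents):
    """Return, sorted, the URIs that occur in more than one document."""
    counts = Counter()
    for doc in documents:
        # Each document contributes a URI at most once, since scheme_uris
        # returns a set.
        counts.update(scheme_uris(doc))
    return sorted(uri for uri, n in counts.items() if n > 1)


print(common_schemes([DOC_A, DOC_B]))
```

The comparison never looks at the human-readable label, so encoders are free
to name the scheme in whatever language they like, as long as they agree on
the URI.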
I am interested in other suggestions, and also in whether anyone else
sees a value in this - should there be an official guideline
recommending it? It seems to me that a growth in text encoding and in
interoperability would make something like this very useful.
New Zealand Electronic Text Centre