Hi list,
I'm just learning about TEI at the moment, and plan to use it in some
capacity to encode some Ancient Greek texts.
Ideally, I'd like to use stand-off markup to keep the TEI XML separate
from the UTF-8 text itself. This isn't entirely possible at present, as
the text would need at least some <ab> tags in order to be addressable
using XInclude [0] [1].
There is currently an early draft (or pre-draft, I suppose) of XInclude
1.1 [2], which seeks to incorporate RFC 5147 [3]. This RFC defines a
reasonable way to address sections of a plain text file through a URI
fragment identifier. This would make stand-off markup of a plain UTF-8
file completely doable, in a standard, simple way, so one could include
text with something like this:
<xi:include href="page10.txt" parse="text" textpointer="char=5,20" />
However, at the moment XInclude 1.1 is not even a draft, and certainly
no XML library has added its functionality yet. What I'm wondering is
whether, once XInclude 1.1 becomes a standard, TEI will automatically be
compatible with it. I'm considering writing a little XML pre-processor
which just does the appropriate RFC 5147 includes for now, until
XInclude 1.1 becomes available and supported in XML libraries. But I
want to be sure that if I add XML in an XInclude 1.1 style, it will be
forward-compatible and simply work once this happens.
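
For illustration, here is a rough sketch of what such a pre-processor
might look like in Python. It is only an assumption about how things
could work: the "textpointer" attribute is the one from my example
above, not something from a published spec, and the function names
(select_chars, expand_includes, read_utf8) are made up for this sketch.
It handles only the "char=" ranges of RFC 5147, not "line=" or the
integrity checks.

```python
import re
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


def select_chars(text, pointer):
    """Return the slice named by an RFC 5147 range such as 'char=5,20'.

    Positions count the gaps between characters, starting at 0, so
    'char=5,20' corresponds to the Python slice text[5:20].
    """
    m = re.fullmatch(r"char=(\d*),(\d*)", pointer)
    if m is None or (m.group(1) == "" and m.group(2) == ""):
        raise ValueError("unsupported text pointer: %r" % pointer)
    start = int(m.group(1)) if m.group(1) else 0
    end = int(m.group(2)) if m.group(2) else len(text)
    return text[start:end]


def read_utf8(href):
    """Default loader: treat href as a local path to a UTF-8 file."""
    with open(href, encoding="utf-8") as f:
        return f.read()


def expand_includes(root, load_text=read_utf8):
    """Replace each <xi:include parse="text" textpointer="..."> in place
    with the selected characters of the referenced file."""
    include_tag = "{%s}include" % XI_NS
    for parent in list(root.iter()):
        prev = None  # last sibling that was kept, if any
        for child in list(parent):
            pointer = child.get("textpointer")
            if (child.tag != include_tag
                    or child.get("parse") != "text"
                    or pointer is None):
                prev = child
                continue
            included = select_chars(load_text(child.get("href")), pointer)
            included += child.tail or ""
            # ElementTree stores text on the parent or on the previous
            # sibling's tail, so attach the included text there.
            if prev is None:
                parent.text = (parent.text or "") + included
            else:
                prev.tail = (prev.tail or "") + included
            parent.remove(child)
    return root
```

One caveat for Greek: Python strings are indexed by code point, so the
offsets depend on whether the text file uses precomposed or decomposed
accents. The file would need to stay in one normalization form for the
pointers to remain valid.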
Many thanks,
Nick White
0: http://www.aclweb.org/anthology/W09-3011
1: http://www.balisage.net/Proceedings/vol5/html/Banski01/BalisageVol5-Banski01.html
2: http://www.w3.org/TR/xinclude-11-requirements/
3: http://www.ietf.org/rfc/rfc5147.txt