Dear members of the list:
 
I am writing in regard to developing a Writing System Declaration for
Indic scripts. It was suggested that I contact Dr. Harry Gaylord at the
University of Groningen, but the email address I was given is apparently
invalid ([log in to unmask]).
 
I have looked at the existing WSDs for Arabic, Coptic, etc. on the TEI
homepage, and have some idea of the structure and details of WSDs.
Obviously, I will do a bit more research on the topic before I proceed
with the project. I do, however, have one initial question: a colleague
expressed his interest in seeing CSX+ declared in the proposed WSD. CSX+
(Classical Sanskrit eXtended+) is an 8-bit encoding scheme for the
transliteration of Indic scripts. It is a common medium used by
Indologists for editing and exchanging electronic documents.
 
Now, in the WSDs I looked at, a single Unicode code point is given
for each letter defined. However, Unicode does not provide a precomposed
character for every letter in the CSX+ repertoire. In Unicode, such
CSX+ characters would therefore be produced by combining two (or three,
or even four) code points. For example, the character
"r-underring-macron" would consist of U+0072, U+0325, and U+0304 ("r" +
"combining ring below" + "combining macron").  Am I correct in assuming
that the character would be declared in the WSD using all three codes?
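
To make the example above concrete, here is a minimal sketch in Python
(used purely for illustration; it is not part of any WSD tooling, and
the helper name `describe` is hypothetical). It lists the code points
that make up "r-underring-macron" and checks whether Unicode
normalization can reduce the sequence to fewer code points.

```python
import unicodedata


def describe(sequence):
    """Return (code point, Unicode name) pairs for each character.

    Hypothetical helper for illustration only.
    """
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in sequence]


# "r" + "combining ring below" + "combining macron", as in the example above.
r_underring_macron = "\u0072\u0325\u0304"

parts = describe(r_underring_macron)
for code, name in parts:
    print(code, name)

# NFC composes a sequence into precomposed characters where they exist.
# If the length is unchanged, no precomposed form is available and the
# letter has to be represented as the full sequence.
composed = unicodedata.normalize("NFC", r_underring_macron)
print(len(r_underring_macron), len(composed))
```

If the two lengths printed at the end are equal, the letter has no
shorter precomposed form, so any declaration of it would presumably
need to refer to the whole sequence rather than to a single code.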
 
I have in my possession a very old draft of the "TEI Guidelines" (draft
2, August 1992) which I dug up off the TEI ftp-site. Due to the numeric
convention used for filenames containing TEI documents, I was unable to
locate a more recent version. Could any member please refer me to
documents that pertain to the topic of WSDs?
 
Thank you for your time and assistance.
 
Regards,
Anshuman Pandey