Hi Daniel,

As part of our work on Buddhist texts we develop a couple of glossaries in
various languages (Chinese, English, Sanskrit, Pali, Tibetan). We keep the
source in TEI P5 making use mostly of the dictionary module.
From there we generate different output versions: stardict, pdf, ePub and
sometimes a free plugin for a proprietary translation software called
DrEye, which is very popular in Taiwan and China.
For a table of available glossaries and formats see here:
http://dev.ddbc.edu.tw/~ray/glossaries/

There is no point in bookmarking that, though. We are in the process of
reworking the website, so the glossaries will soon be moved to the stable site:
http://buddhistinformatics.ddbc.edu.tw/glossaries/

So far the structure of the glossaries we maintain is fairly simple, but we
are working on a number of more (much more) complicated ones these days,
with lots of missing characters, difficult diacritics, different types of
underlines, various types of cross references etc. I believe TEI Dict is
pretty much the only game in town when it comes to this level of
sophistication. I do not know of any other standard in which you could
express this kind of data structure within a semantic framework.

Designing the markup for these glossaries, however, I was again reminded of
how much the TEI Dict module is oriented towards the digitization of a print
original. This of course reflects the earliest concerns of TEI, but I still
can't say I am very happy with it.
In the end it is very useful as a basic low-level format for all kinds of
reference works. I feel, however, that it is too loose and allows too many
functionally equivalent structures to be useful beyond digital archives. The
scripting into different output formats, formats that people actually use
on their devices, is almost always a process of "dumbing down" the
information contained in the TEI (<cit> becomes an indented paragraph,
<xr>s become links, etc.).
Also, it always involves different scripts for different glossaries, because
dictionaries vary so much in structure and content. All this makes TEI Dict
a reasonable archival or source format, but it is of limited use as the "open
electronic publishing standard" that you are looking for, if I understand
that term correctly.
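
To make the "dumbing down" a bit more concrete, here is a rough Python
sketch of the kind of flattening described above. It is an illustration only,
not one of our actual conversion scripts: the sample entry, the function name
entry_to_html and the HTML chosen for the output are all assumptions made for
the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# TEI P5 namespace, needed for every element lookup.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical entry, invented for this sketch; real glossary entries
# are structured differently from one glossary to the next.
SAMPLE_ENTRY = """
<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="dharma">
  <form><orth>dharma</orth></form>
  <sense>
    <def>teaching; phenomenon</def>
    <cit type="example"><quote>sarva-dharma</quote></cit>
    <xr type="see"><ref target="#fa">法</ref></xr>
  </sense>
</entry>
"""


def flat_text(element):
    """Return the element's text content with whitespace normalized."""
    return " ".join("".join(element.itertext()).split())


def entry_to_html(entry_xml):
    """Flatten one TEI <entry> into simple HTML, discarding the semantics."""
    entry = ET.fromstring(entry_xml)
    parts = []

    # The headword becomes a plain heading.
    orth = entry.find(".//tei:orth", NS)
    if orth is not None:
        parts.append(f"<h3>{escape(flat_text(orth))}</h3>")

    # Definitions become ordinary paragraphs.
    for definition in entry.iterfind(".//tei:def", NS):
        parts.append(f"<p>{escape(flat_text(definition))}</p>")

    # <cit> loses its type and internal structure: just an indented paragraph.
    for cit in entry.iterfind(".//tei:cit", NS):
        parts.append(
            f'<p style="margin-left: 2em">{escape(flat_text(cit))}</p>'
        )

    # <xr> loses its type as well: only a bare link survives.
    for ref in entry.iterfind(".//tei:xr/tei:ref", NS):
        target = escape(ref.get("target", ""))
        parts.append(f'<a href="{target}">{escape(flat_text(ref))}</a>')

    return "\n".join(parts)


if __name__ == "__main__":
    print(entry_to_html(SAMPLE_ENTRY))
```

Note that the type attributes on <cit> and <xr> simply disappear in the
output, and that a glossary with a different entry structure would need a
different set of lookups.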

all the best

marcus


-- 
============================
Dr. Marcus Bingenheimer 馬德偉
Dharma Drum Buddhist College  法鼓佛教研修學院 (DDBC)
No. 2-6 Xishihu, Jinshan  20842, Taipei County, Taiwan, R.O.C.
台灣,20842台北縣金山鄉西勢湖2-6號  Tel:  +886-2-2498-0707#2227
http://buddhistinformatics.ddbc.edu.tw/~mb/