Hi Daniel,

As part of our work on Buddhist texts we develop a couple of glossaries in various languages (Chinese, English, Sanskrit, Pali, Tibetan). We keep the source in TEI P5, making use mostly of the dictionary module.
From there we output into different formats: StarDict, PDF, EPUB and sometimes a free plugin for a proprietary translation software called DrEye, which is very popular in Taiwan and China.
For a table of available glossaries and formats see here:
http://dev.ddbc.edu.tw/~ray/glossaries/

No use bookmarking that. We are in the process of reworking the website, so soon they will be moved to the stable site:
http://buddhistinformatics.ddbc.edu.tw/glossaries/

So far the structure of the glossaries we maintain is fairly simple, but we are working on a number of more (much more) complicated ones these days, with lots of missing characters, difficult diacritics, different types of underlines, various types of cross references etc. I believe TEI Dict is pretty much the only game in town when it comes to this level of sophistication. I do not know of any other standard where you could express this kind of data structure within a semantic framework.

Designing the markup for these glossaries, however, I was again reminded of how much the TEI Dict module is oriented towards the digitization of a print original. This of course reflects the earliest concerns of TEI; still, I can't say I am very happy with it.
In the end it is very useful as a basic low-level format for all kinds of reference works. But I feel it is too loose and allows too many functionally equivalent structures to be useful beyond digital archives. The scripting into different output formats, formats that people actually use on their devices, is almost always a process of "dumbing down" the information contained in the TEI. (<cit> becomes an indented paragraph, <xr>s become links etc.)
It also always involves different scripts for different glossaries, because dictionaries vary so much in structure and content. All this makes TEI Dict a reasonable archival or source format, but of limited use as the "open electronic publishing standard" that you are looking for, if I understand that term right.
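
To illustrate the kind of "dumbing down" meant here, a minimal sketch in Python follows. The sample entry, the function name entry_to_html and the HTML it produces are all made up for illustration; this is not one of our actual conversion scripts, and a real glossary would need many more cases.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample entry, not taken from any actual glossary.
SAMPLE = """<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="dharma">
  <form><orth>dharma</orth></form>
  <sense>
    <def>teaching; phenomenon</def>
    <cit type="example"><quote>sarve dharmāḥ</quote></cit>
    <xr type="syn"><ref target="#fa">法</ref></xr>
  </sense>
</entry>"""


def flat_text(element):
    """Collect all text below an element, discarding the markup."""
    return "".join(element.itertext()).strip()


def entry_to_html(entry_xml):
    """Flatten one TEI <entry> into simple HTML (illustrative only)."""
    entry = ET.fromstring(entry_xml)
    out = ET.Element("div", {"class": "entry"})

    # The headword becomes plain bold text.
    orth = entry.find("tei:form/tei:orth", NS)
    head = ET.SubElement(out, "b")
    head.text = flat_text(orth) if orth is not None else ""

    for sense in entry.findall("tei:sense", NS):
        for child in sense:
            tag = child.tag.split("}")[-1]  # strip the namespace
            if tag == "def":
                ET.SubElement(out, "p").text = flat_text(child)
            elif tag == "cit":
                # <cit> loses its type and becomes an indented paragraph.
                p = ET.SubElement(out, "p", {"style": "margin-left: 2em"})
                p.text = flat_text(child)
            elif tag == "xr":
                # <xr> loses its type; each <ref> becomes a plain link.
                for ref in child.findall("tei:ref", NS):
                    a = ET.SubElement(out, "a", {"href": ref.get("target", "")})
                    a.text = flat_text(ref)
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    print(entry_to_html(SAMPLE))
```

Note that the type attributes on <cit> and <xr> simply disappear in the output, and that a glossary with a different entry structure (nested senses, <re>, <gramGrp> etc.) would need its own set of cases.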

all the best

marcus


--
============================
Dr. Marcus Bingenheimer 馬德偉
Dharma Drum Buddhist College  法鼓佛教研修學院 (DDBC)
No. 2-6 Xishihu, Jinshan  20842, Taipei County, Taiwan, R.O.C.
台灣,20842台北縣金山鄉西勢湖2-6號  Tel:  +886-2-2498-0707#2227
http://buddhistinformatics.ddbc.edu.tw/~mb/