> ... we are not sure about how to encode or indicate the language of
> each work. We consider that a good place to do it can be the
> attribute id= [of] the TEXT element.
The lang= attribute of the TEXT element would be *much* better.
Besides being the place where this information is supposed to go,
this frees up id= for what it is for -- a unique identifier. Even
though your files are separate now, it is not at all unreasonable
to think that in the future you might want to assemble them all
together as one big corpus file (e.g., a TEICORPUS). If you have a
project-wide unique id= on each TEXT ahead of time, this will
facilitate the task. At the WWP each encoded text has a reference
number. The physical version in the file cabinet is referred to
by "OT" (for "office text") followed by the 5-digit reference
number; the corresponding TEI file has an id= of "TR" (for
"transcription") followed by the same 5-digit reference number.
This makes it easy to find the physical copy associated with each
file, and vice versa.
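
As a rough sketch of what this might look like in one of your files
(the reference number 00123 and the language value "es" are made up
for illustration, and the values allowed for lang= depend on the
version of the Guidelines and on what your header declares):

```xml
<!-- Hypothetical values: "TR00123" and "es" are illustrative only. -->
<text id="TR00123" lang="es">
  <body>
    <!-- transcription of the work goes here -->
  </body>
</text>
```

Under a scheme like ours, the physical copy for this file would then
be filed as OT00123.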
-- Syd Bauman, EMT-Paramedic
SGML & XML programmer / analyst
Brown University Women Writers Project
401-863-3835