> One issue which I think is really important is getting a little lost 
> in  the discussion here: whatever solution we choose, it must be  
> formally-structured enough so that any application can find out about 
>  any other application.

I'm not convinced. We could perhaps distinguish two goals:

- making an inventory of all the pieces of information an application
may need, and elaborating a standard way of representing this
information;
- providing applications with a general mechanism they can use to
define the information they need and how they want the user to express
it.

The goal of creating an inventory of applications' needs, and a
standardized way of expressing them, seems to me far too complex. It
redefines the goal of the TEI, which was (if I understand correctly) to
identify the features of a text, not the goals and needs of every
application dealing with it:

    The goals of the TEI project initially had a dual focus: being
    concerned with both what textual features should be encoded (i.e.
    made explicit) in an electronic text, and how that encoding should
    be represented for loss-free, platform-independent, interchange.
    <http://www.tei-c.org.uk/Vault/SC/J31/WHAT.htm>

Furthermore, I don't see why it is important that an application can 
understand the parameters used by another application. By definition, 
an application is a black box -- at least from the point of view of 
another application :-). Information common to several applications 
should be, by definition, in the standard existing TEI markup, which is
precisely defined for this very usage: expressing, in machine-readable
form, information not specific to any one application. On the other
hand, a mechanism for expressing an application's parameters should, in
my view, be used precisely for expressing things that concern only that
application.

> There are indeed lots of approaches which could  be taken by any 
> application author using existing tags and attributes,  but there is 
> no clear expectation that this information should be  recorded in a 
> particular way -- nothing recommended in the guidelines,  and no 
> established practice we can draw on. The TEI often works in this  
> way, which is why we hear complaints that automatic parsing systems 
> can  never find information they need reliably in the TEI header -- 
> there's  no guarantee which of a multitude of possible approaches 
> might have been  taken in any particular case.

> When we come to authoring applications, this flexibility is problematic;

Yes, this is perfectly clear and often pointed out. But this
flexibility is also a defining feature of the TEI.

The choice is between:

- redefining the header entirely (and the whole TEI) to express this
information more formally and predictably (which is both useless and
impossible in my view), or
- allowing applications to use their own strategy, *in a standardised
way*, for identifying information in the header.

I think the standardisation can only be at the level of the vocabulary
used for expressing parameters, not of the parameters themselves.

This allows fine coupling between an application and the existing
markup. For instance, the "last save" information may be stored in the
existing way (an item in revisionDesc), but with an IDREF on the
"change" element created or used by the application, pointing to that
registered application.
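
As a rough illustration of this coupling, here is a small Python
sketch. It is only a sketch under assumptions: the "application"
element, its placement in encodingDesc, the use of a "who" attribute as
the pointer, and the sample ids and dates are all hypothetical names
chosen for the example, not something the Guidelines lay down.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical header fragment: the <application> element and the use of
# @who as the pointer are assumptions made for this illustration only.
SAMPLE = """\
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <encodingDesc>
    <application xml:id="imgTool">Hypothetical image tool</application>
    <application xml:id="bibTool">Hypothetical bibliography tool</application>
  </encodingDesc>
  <revisionDesc>
    <change when="2006-03-01" who="#imgTool">last save</change>
    <change when="2006-03-04" who="#bibTool">last save</change>
    <change when="2006-03-02">Manual correction by an editor</change>
  </revisionDesc>
</teiHeader>
"""


def changes_by_application(header_xml, app_id):
    """Return (when, text) for each <change> pointing at the given application."""
    root = ET.fromstring(header_xml)
    # Only applications registered in the header may be looked up.
    registered = {el.get(XML_ID) for el in root.iter(f"{{{TEI_NS}}}application")}
    if app_id not in registered:
        return []
    target = "#" + app_id
    return [
        (change.get("when"), (change.text or "").strip())
        for change in root.iter(f"{{{TEI_NS}}}change")
        if change.get("who") == target
    ]


print(changes_by_application(SAMPLE, "imgTool"))
```

Each application only needs to recognise its own pointer; the content
of the "change" elements stays ordinary TEI markup, readable by humans
and ignored by other tools.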

> my application does need to be able to find information about other  
> applications which have touched the file,

So it may be the responsibility of your application to know how other
applications use the file, not the responsibility of the TEI to provide
you with this information.

> predictable way. If all application authors create their own feature  
> structures, it's going to be difficult to decode the information they 
>  record.

You will need individual knowledge of each application. I'm not sure it
would be any less difficult for the TEI to take on the burden of
creating a robust, scalable, generalised, effective and standardised
vocabulary for applications.

> to this particular approach; what I AM insistent about, though, is 
> that  whatever approach is chosen, it will only be useful if it's 
> absolutely  predictable.

Yes, but there are two possible meanings: making the encoding of your
application predictable to your application, and making the encoding of
every application predictable to every application.

> It may be that all we need is a new section of the  guidelines, 
> laying down, in a fairly rigid form, what authoring  applications are 
> expected to do when recording information about  themselves in the 
> header,

But applications may potentially do absolutely anything, and may need
absolutely all of the information provided in human-oriented fashion in
the document, since it defines the content of the document. So should
we redefine all the information contained in the TEI vocabulary in a
more formal way?

> My app "declares an interest" in two or three elements in the file  
> (identified by xml:id).

Other applications may be coupled with every one of the numerous
occurrences of a given element, in which case the pointer mechanism
would be pointless.

> variability in its encoding structures. Authoring apps that emerge 
> for  TEI, where they are not simply XML editors such as oXygen, are 
> most  likely to be specific to particular types of content or areas 
> of the  file -- a teiHeader editor, an image tool, or a bibliography 
> program.
> More than one tool may be used to work on any particular file,

I agree with all of this.

> and there really has to be a mechanism by which they can avoid 
> treading on each other's toes,

I think it's difficult to make the encoding perfectly predictable, at
the same time, for every application that might edit the document.

> So while it's true to say that applications can already make use of  
> existing elements to store a lot of this information, that's not the  
> point; we need a common system that all applications can depend on.

The point, I think, is to be "common" or not to be "common".

-- 
Sylvain Loiseau
http://www.limsi.fr/~sloiseau

One can pursue objectively, that is to say impartially,
an inquiry whose object cannot be conceived and constructed
without reference to a positive and negative qualification, whose
object is therefore not so much a fact as a value.

Canguilhem, /Le normal et le pathologique/, p. 157
