Hi there,

This discussion has dropped off the radar, and I'd like to raise it 
again because I have to make decisions for my application fairly soon.

It appears using <creation> is wrong, for the reasons explained by Lou 
Burnard; it also appears that there are at least two types of 
"authoring" application, typified by my Image Markup Tool and by the 
EPT, which have different relationships to the original material they're 
marking up, but which should both be able to use the same version 
information structures to describe themselves in the file. Having looked 
again at using <respStmt>, I think it's too unstructured to do the job 
well; at the very least, it would need its description modified to 
remove the reference to responsibility for "intellectual content", and 
the schema would need to be modified to allow the use of <ident> inside 
it. If we're going to modify schemas, we might as well try and do a 
proper job, and provide a useful set of tags for encoding this information.

I'd like to propose more formally what I threw together as an example 
last week:

<creatorApp>
    <appIdent key="appId">ImageMarkupTool1</appIdent>
    <appIdent key="appName">Image Markup Tool</appIdent>
    <appIdent key="appVersion">1.0.3.5</appIdent>
    <appIdent key="appURI">http://..../</appIdent>
    <appIdent key="userDefined" userKey="licence">Mozilla Public Licence
1.1</appIdent>
    <date value="2006-05-25T11:03:55">Last save: 2006-05-25 at
11:03:55</date>
</creatorApp>

This would involve only two new tags, <creatorApp> and <appIdent>, and 
two attributes, "key" (an enumeration) and "userKey" (a string). Lou's 
suggestion that it belongs in encodingDesc is surely correct. I've 
included below the more detailed explanation from my message last week. 
Before I submit this as a feature request on the SourceForge site, I'd 
like a little more feedback (especially from other people involved with 
creating TEI tools). Any comments?


[previous explanation]

It might be best to begin by setting out the array of information we may
want to record with these tags. I've already listed three items:

	appName (human-usable application title)

	appURI (i.e. where you could go to get the application, or info about it)

	appVersion (conventional dot-separated Major, Minor, Release and Build
integers)

I think this may also be very useful:

	appId:

This would be an identifier for the application which conforms to xml:id
constraints. Such ids are used by application developers, for example to
create a mutex when an application is running in Windows, so that an
installer or uninstaller can check to see whether the application is
already running before attempting to install it or uninstall it.
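
As a rough illustration of the two constrained items above, here is a minimal Python sketch. It is not part of the proposal: the function names are hypothetical, the appId check is only an ASCII approximation of the xml:id (NCName) rule, and the version check assumes exactly four dot-separated integers (Major, Minor, Release, Build).

```python
import re

# ASCII-only approximation of the NCName rule that xml:id values follow:
# start with a letter or underscore, then letters, digits, '.', '-' or '_'.
# (Assumption: real NCNames also allow many non-ASCII characters.)
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

# Exactly four dot-separated integers: Major.Minor.Release.Build.
VERSION = re.compile(r"[0-9]+(?:\.[0-9]+){3}")


def is_plausible_app_id(value):
    """Return True if value looks like a usable xml:id-style identifier."""
    return NCNAME_ASCII.fullmatch(value) is not None


def parse_app_version(value):
    """Turn '1.0.3.5' into (1, 0, 3, 5) so versions compare numerically."""
    if VERSION.fullmatch(value) is None:
        raise ValueError("expected Major.Minor.Release.Build, got %r" % value)
    return tuple(int(part) for part in value.split("."))


current = parse_app_version("1.0.3.5")
newer = parse_app_version("1.0.10.0")
print(is_plausible_app_id("ImageMarkupTool1"), current < newer)
```

Parsing the version into integers matters because a plain string comparison would rank "1.0.10.0" before "1.0.3.5".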

We can imagine lots of others, relating to licensing (GPL, Moz 1.1,
etc.), registered user, programmer or publisher, but these are less
important and more difficult to constrain to a fixed format.

I prefer "app" to "product" as a prefix; "product" has a rather nasty
marketing-speak flavour about it to me, although that's a personal thing.

There is an analogue for all this in the Windows file versioning
information system (and presumably on other platforms too). Any Windows
executable can contain a data structure which encodes a range of
name-value pairs, many of which are formally documented and expected
(major version, minor version, build, release, etc.) and some of which
are conventional, but it also allows you to add your own name-value
pairs and use standard API calls to retrieve them. If we're considering
creating a new structure or set of tags, I think it would be a good idea
to follow a similar model, with some items we'd expect the application
to add to the file, but with the flexibility for other information to be
encoded in a standard way. Something like this is what I imagine:

<creatorApp>
    <appIdent key="appId">ImageMarkupTool1</appIdent>
    <appIdent key="appName">Image Markup Tool</appIdent>
    <appIdent key="appVersion">1.0.3.5</appIdent>
    <appIdent key="appURI">http://..../</appIdent>
    <appIdent key="userDefined" userKey="licence">Mozilla Public Licence
1.1</appIdent>
    <date value="2006-05-25T11:03:55">Last save: 2006-05-25 at
11:03:55</date>
</creatorApp>

With a system like this, getting all the available info about an
application is simply a question of iterating through the appIdent tags.
Specific info can be retrieved using the key attribute (or whatever
would be a better name for it). The key attribute could be constrained
to an enumeration which also includes "userDefined", and that could be
used in combination with an unconstrained userKey attribute to add any
info the developer thinks is required.
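
To show how little code the reading side would need, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes the proposed elements appear without a namespace and exactly as in the example above; the function name read_app_info is hypothetical, and the choice to file userDefined items under their userKey is an assumption rather than part of the proposal.

```python
import xml.etree.ElementTree as ET

# The example structure from this message (not yet part of any TEI schema).
SAMPLE = """
<creatorApp>
    <appIdent key="appId">ImageMarkupTool1</appIdent>
    <appIdent key="appName">Image Markup Tool</appIdent>
    <appIdent key="appVersion">1.0.3.5</appIdent>
    <appIdent key="appURI">http://..../</appIdent>
    <appIdent key="userDefined" userKey="licence">Mozilla Public Licence
1.1</appIdent>
    <date value="2006-05-25T11:03:55">Last save: 2006-05-25 at
11:03:55</date>
</creatorApp>
"""


def read_app_info(creator_app):
    """Collect every appIdent value into a dict keyed by its key attribute.

    Items with key="userDefined" are stored under their userKey instead.
    """
    info = {}
    for ident in creator_app.findall("appIdent"):
        key = ident.get("key")
        if key == "userDefined":
            key = ident.get("userKey")
        # Collapse the line breaks and indentation inside the element text.
        info[key] = " ".join((ident.text or "").split())
    return info


app_info = read_app_info(ET.fromstring(SAMPLE))
for name, value in app_info.items():
    print(name, "=", value)
```

A single lookup such as app_info["appVersion"] then retrieves one specific item.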

The date tag is useful in this scenario:

We can envisage multiple applications working on a file in sequence, and
it would be politic not to destroy the creatorApp information about
previous applications. For instance, imagine that someone creates a
really powerful and intuitive tool for building teiHeaders, and that's
all it does. You might mark up an image using the Image Markup Tool
(which lets you edit the teiHeader as text, but doesn't help you with
it), and then use the teiHeader tool to add an elaborate header. You
could then edit the file again in the Image Markup Tool (which wouldn't
damage or change any elements of the teiHeader other than those which
directly concern it). It would be important to know which of these
applications was used to edit the file, and when. So each tool, assuming
they're both using this proposed versioning system, would be able to
identify its own creatorApp tag (using the appId) and modify that every
time it edited the file, while leaving the other application's
creatorApp tag alone.
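
The update step described above could look something like the following Python sketch. It assumes the creatorApp elements sit directly inside encodingDesc with no namespace; the function name stamp_own_entry and the second application's id "HeaderTool1" are hypothetical, invented only for this illustration.

```python
import xml.etree.ElementTree as ET


def stamp_own_entry(encoding_desc, app_id, timestamp):
    """Update the date in the creatorApp whose appId matches app_id.

    Other applications' creatorApp elements are left untouched. If no
    matching entry exists yet, a new one is appended.
    """
    for app in encoding_desc.findall("creatorApp"):
        ident = app.find("appIdent[@key='appId']")
        if ident is not None and (ident.text or "").strip() == app_id:
            break
    else:
        # No entry for this application yet: create a fresh one.
        app = ET.SubElement(encoding_desc, "creatorApp")
        ET.SubElement(app, "appIdent", key="appId").text = app_id
    date = app.find("date")
    if date is None:
        date = ET.SubElement(app, "date")
    date.set("value", timestamp)
    date.text = "Last save: " + timestamp.replace("T", " at ")
    return app


doc = ET.fromstring("""
<encodingDesc>
  <creatorApp>
    <appIdent key="appId">ImageMarkupTool1</appIdent>
    <date value="2006-05-25T11:03:55">Last save: 2006-05-25 at 11:03:55</date>
  </creatorApp>
  <creatorApp>
    <appIdent key="appId">HeaderTool1</appIdent>
    <date value="2006-05-25T12:00:00">Last save: 2006-05-25 at 12:00:00</date>
  </creatorApp>
</encodingDesc>
""")

stamp_own_entry(doc, "ImageMarkupTool1", "2006-05-26T09:00:00")
dates = [app.find("date").get("value") for app in doc.findall("creatorApp")]
print(dates)
```

Only the Image Markup Tool's date changes; the other tool's entry keeps its own timestamp.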

Does this make sense?

I really don't want to customize my schemas to add support for this
information; it really is useless if it can't be found in a predictable
location in a reliable format, because the whole idea is that it be read
and written automatically by authoring programs.

[end of previous comments]

Cheers,
Martin


-- 
Martin Holmes
University of Victoria Humanities Computing and Media Centre
Half-Baked Software, Inc.