Dear Peter and Fabio,
Thanks for your replies.
@Fabio: I definitely want to have a free-standing *TEI* header, and I
don't think I would want to use the DC namespace even if that container
that you mention were ready -- not for this info alone.
I was just looking for a way to consistently mark the information about
the media type of the object that the free-standing header "is about". I
found the <extent> idea pretty sneaky, and far from ideal. It could be
more palatable if I were writing a header for an electronic version of
the text "as such": then I would say, in the <extent>, that, for
example, the PDF file takes up this much space, whereas the TXT version
takes up some different amount.
But I miss a way to state, consistently, that "this header here records
the formal metadata of that file over there, and moreover that file's
media type is 'xxx/yyy'".
Now, to reply to Peter's first point:
> I think you could as well define the pdf to be the source of your TEI
> document, couldn’t you?
But could I? The way I understand the architecture in this case is that
I'm writing about the electronic version, the source of which is a
printed book that got scanned. So I would expect <sourceDesc> to specify
the info on the printed book, and everything else to tell the story of
what the electronic version is, how it came about, who financed it,
etc. (and, possibly, what its media type is, because that will not
always be readable from the filename extension). I could be getting
it wrong, though.
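To spell out my reading of the architecture, here is a rough sketch. All
of the content is placeholder text, not my actual header:

```xml
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <!-- placeholder title of the electronic version -->
      <title>Title of the electronic version</title>
    </titleStmt>
    <publicationStmt>
      <!-- who made the electronic version available, who financed it, etc. -->
      <p>Placeholder statement about the electronic version.</p>
    </publicationStmt>
    <sourceDesc>
      <!-- the printed book that got scanned -->
      <bibl>Placeholder citation of the printed book.</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```

Nothing in this layout gives the media type of the described file an
obvious, well-defined home, which is exactly what I am after.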
Your second point might be helpful to me. I didn't consider using
<facsimile>, because I have always regarded the use of that element
as a kind of commitment to provide linking between the transcribed
text and the facsimile.
But if it could be accepted as standard practice for free-standing
headers of binary objects (errm, but for plain-text objects as well, to
be consistent??) to use <facsimile> only to point to the described
object, I'd be happy to consider that.
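A rough sketch of what that could look like, as I understand it. The
file name is the made-up one from my earlier example and the header
content is placeholder text. As far as I can tell, <facsimile> is a
sibling of <teiHeader>, so the "free-standing header" would have to sit
inside a <TEI> wrapper:

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Title of the electronic version (placeholder)</title>
      </titleStmt>
      <publicationStmt>
        <p>Placeholder statement about the electronic version.</p>
      </publicationStmt>
      <sourceDesc>
        <bibl>Placeholder citation of the printed book.</bibl>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- no transcription here: <facsimile> only points to the described file -->
  <facsimile>
    <media url="my_book.pdf" mimeType="application/pdf"/>
  </facsimile>
</TEI>
```

Here the location of the file and its media type are both
machine-readable and always in the same place.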
I'm eager to hear opinions on this idea.
Thank you both, again!
On 19/02/15 13:56, Peter Stadler wrote:
> Hi Piotr,
> I think you could as well define the pdf to be the source of your TEI document, couldn’t you?
> <ab><media url="" mimeType=""></media></ab>
> <media url="" mimeType=""/>
>> On 18.02.2015 at 22:20, Piotr Bański wrote:
>> Hello everyone,
>> I would like to be able to express, within the fileDesc, the same information as <dc:format>application/pdf</dc:format> expresses.
>> When you prepare a header for an electronic version of a book that is not XML-encoded (in my case it's PDF -- the header will be free-standing), and you would like to specify the media type of the book file, how do you go about that, please?
>> Should I just use a <ref> or a <ptr> somewhere within the publicationStmt (or notesStmt?), so that I can sneak @mimeType into it? But if I do, that is going to be different from the direct statement that I would like to express in machine-readable form, and hence, among other things, in a well-defined position in the header. Am I missing something obvious, please?
>> One hack I've been thinking of is to use <media> under (or next to) <measure> under <extent>.
>> So, for example:
>> <extent>
>>   <measure unit="MiB" quantity="1.4"/>
>>   <media url="my_book.pdf" mimeType="application/pdf"/>
>> </extent>
>> But this solution has a kludgy tinge to it. And I think it places the rather first-order information about the location of the file and about its media type in an almost accidental place in the header.
>> Thanks in advance for any hints you care to offer.