For processing purposes, a sequentially ordered unique identifier for each
page break is useful. On several projects I've worked on, this has been
done with @xml:id, assigned a value built from a four-character identifier,
an abbreviation for page break, and a sequential number independent of
anything printed or written on the page (e.g., xml:id="DIST-pb-0209").
When the numbering is regular, this alone is enough to generate labels
automatically (XSLT with the XPath substring functions, etc.). For
leaves/pages with irregular numbering, one could use @n or @rend to record
the 'erroneous' label, which is then processed differently (XSLT:
<xsl:when test="@n">, etc.). In other words, only pages with non-sequential
or repeated numbering would need the extra attribute:

<pb xml:id="DIST-pb-0288"/>
<pb xml:id="DIST-pb-0289"/>
<pb xml:id="DIST-pb-0290" n="209"/>
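
As a rough sketch of the XSLT side of this (not taken from any of those
projects): it assumes XSLT 1.0, <pb> elements in no namespace as in the
example above, and an xml:id that always contains "-pb-" followed only by
digits. In a real TEI P5 file the elements would be in the TEI namespace
and the match pattern would need a prefix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Emit one label per page break, one per line -->
  <xsl:template match="pb">
    <xsl:choose>
      <!-- Irregular page: use the label recorded in @n as-is -->
      <xsl:when test="@n">
        <xsl:value-of select="@n"/>
      </xsl:when>
      <!-- Regular page: take the digits after "-pb-";
           number() drops the leading zeros (assumed id pattern) -->
      <xsl:otherwise>
        <xsl:value-of select="number(substring-after(@xml:id, '-pb-'))"/>
      </xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress all other text so only the labels are output -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

Run against the three <pb> elements above (wrapped in any root element),
this should output 288, 289 and 209.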

I wouldn't think this is too unusual a situation, nor always the result of
error: several kinds of books come to mind that might contain repeating
page labels (omnibus editions of multi-volume series, works with two parts
that are independently numbered, etc.).


On Fri, Jan 16, 2009 at 3:14 PM, Martin Holmes wrote:

> Hi folks,
> We have a set of printed books in which page-numbering is frequently
> erratic; numbers are omitted and repeated, and sometimes the order of digits
> in the page number is wrong.
> I'd be glad to know how anyone else has handled marking up this problem.
> One approach we took initially for digit-ordering was to do this:
> <pb n="146" rend="164" />
> where what should have been 146 was printed as 164. But on another volume,
> I've found that page numbers 18 and 19 were repeated, meaning that
> everything subsequent to that is "wrong"; that's made me reconsider what I
> mean by "wrong" in this context, and whether a page number might just be
> better viewed as a label rather than a necessarily unique identifier for a
> page.
> Cheers,
> Martin
> --
> Martin Holmes
> University of Victoria Humanities Computing and Media Centre
> Half-Baked Software, Inc.