Just to echo Stuart's comment below: this approach has always been the one I favoured. It's a cross between Martin's options 1 and 2.

It is like option 1 in that the TEI markup is refactored to split such lists into two, but unlike it in that the refactoring is automated rather than manual. It's like option 2 in that it uses XSLT to automatically handle the case of lists with embedded forme work and page breaks, but unlike it in that the special handling is performed entirely in TEI, independently of any transformation to HTML.

This is done using a "pipeline" in which multiple XSLT transforms are chained together, with the output of one step feeding into the input of the next. To handle page breaks in lists, you would use a transformation whose input is TEI with lists containing page breaks, and whose output is TEI in which any such lists have been split. So this is a preliminary, pre-processing step in which general TEI is converted into much more constrained TEI. This more constrained TEI is then fed into a stylesheet which converts it to HTML. Because the input to the HTML stylesheet is much simpler TEI (without list//pb), the HTML stylesheet can be much simpler to write, understand, and debug.
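
To make the shape of such a pipeline concrete, here is a minimal sketch in Python rather than XSLT (the real thing would be two chained stylesheets). It is an illustration under several assumptions, not anyone's actual implementation: the function names are hypothetical, the elements are un-namespaced for brevity (real TEI lives in the TEI namespace), only a <pb> that is a direct child of a <list> is handled (not one buried deeper inside an item), and continuing the numbering of ordered lists is left out.

```python
import xml.etree.ElementTree as ET


def split_one(lst):
    """Split one <list> into [list, pb, list, ...] at each direct <pb> child."""
    pieces = []
    current = ET.Element("list", dict(lst.attrib))
    for child in lst:
        if child.tag == "pb":
            if len(current):          # skip empty fragments
                pieces.append(current)
            pieces.append(child)      # the <pb> floats up between the two lists
            current = ET.Element("list", dict(lst.attrib))
        else:
            current.append(child)
    if len(current):
        pieces.append(current)
    return pieces


def split_lists_at_pb(root):
    """Stage 1: general TEI in, constrained TEI (no list/pb) out."""
    for parent in list(root.iter()):
        new_children = []
        for child in list(parent):
            if child.tag == "list" and child.find("pb") is not None:
                new_children.extend(split_one(child))
            else:
                new_children.append(child)
        parent[:] = new_children
    return root


def tei_to_html(node):
    """Stage 2: a deliberately naive TEI-to-HTML step; it can assume no list/pb."""
    if node.tag == "pb":
        out = ET.Element("hr", {"class": "pagebreak"})
    else:
        out = ET.Element({"list": "ul", "item": "li"}.get(node.tag, "div"))
        out.text = node.text
        for child in node:
            out.append(tei_to_html(child))
    out.tail = node.tail
    return out


def run_pipeline(tei_source):
    """Chain the stages: the output of each one is the input of the next."""
    root = ET.fromstring(tei_source)
    for stage in (split_lists_at_pb, tei_to_html):
        root = stage(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(run_pipeline(
        '<div><list><item>a</item><pb n="2"/><item>b</item></list></div>'
    ))
```

The point of the design is that the second stage never has to know that lists can contain page breaks; all of that policy sits in the first stage and can be changed without touching the HTML conversion.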



From: TEI (Text Encoding Initiative) public discussion list [[log in to unmask]] on behalf of stuart yeates [[log in to unmask]]
Sent: Friday, 19 August 2011 6:07 AM
To: [log in to unmask]
Subject: Re: Handling lists in XHTML transforms

On 19/08/11 05:59, Martin Holmes wrote:

> 1. Refactor the XML markup to break the list into two, with some kind of
> connecting mechanism to handle numbering correctly in the case of an
> ordered list. I don't like this because I don't like markup to be driven
> by output constraints.

At the NZETC we have an XSL step in our TEI to HTML pipeline that
essentially does this at transformation time. It floats <pb>s up to an
appropriate level and splits lists etc.

This avoids the need for markup to be driven by output constraints
and enables you to change your policy around when / how to split things.

Note that in some cases the NZETC also wilfully generates bad HTML,
knowing that browsers can handle it. The best example I can think of is
our table handling in ePubs, which contravenes the standard.

Stuart Yeates
Library Technology Services