I have put a series of test files at http://nl.ijs.si/tei/convert/
The manual pb doesn't show up in navodila.docx.xml which was made with
Stylesheets/docxtotei --profile=transcription navodila.docx navodila.docx.xml
Maybe you handled only the kind of manual break that starts a new section?
navodila.rtf.* were made by the http://nl.ijs.si/e-zrc/rtf2tei/ service, and "navodila" are the instructions on how to use it, with examples,
so it's quite good for debugging, although the text is in Slovene. This service is quite old (but maintained) and I'd like to extend it now, and also switch to the TEI Stylesheets, so I'm sure to have more comments, and I would also be glad to take part in some development!
From: TEI (Text Encoding Initiative) public discussion list [mailto:[log in to unmask]] On Behalf Of Sebastian Rahtz
Sent: Wednesday, April 03, 2013 11:15 PM
To: [log in to unmask]
Subject: Re: [TEI-L] Word to TEI: capturing page breaks?
On 3 Apr 2013, at 18:20, Kevin Hawkins <[log in to unmask]> wrote:
> There's the kind that forces a break at that point, which you insert manually. This has various subtypes: some start a new "section" (important for the size of margins etc.), whereas others continue in the same section.
> But then there's also the kind of breaks that just happen to occur because that's where the text flows. If you edit the text, it reflows, and the break occurs at a different point.
for what it's worth, my docx to TEI conversion now supports both forms of page break, but the auto-generated ones are
ignored by default. Their conversion can be turned on with the parameter "preserveSoftPageBreaks".
wot larks eh Pip.
PS I also discovered I had not put in support for transforming column breaks in Word; that is now done.
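
For anyone debugging which kinds of break a given .docx actually contains before running it through the conversion, here is a minimal sketch. It is not part of the TEI Stylesheets, and the function names are made up for illustration. It assumes the usual WordprocessingML markers: a manual page break is `<w:br w:type="page"/>`, a column break is `<w:br w:type="column"/>`, and an auto-generated (soft) page break is recorded as `<w:lastRenderedPageBreak/>`. Section breaks (`w:sectPr`) are not counted here.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_breaks(document_xml):
    """Count break markers in the XML content of word/document.xml.

    Hypothetical helper for debugging; accepts str or bytes.
    """
    root = ET.fromstring(document_xml)
    counts = {"hard_page": 0, "soft_page": 0, "column": 0}
    for br in root.iter(f"{{{W}}}br"):
        kind = br.get(f"{{{W}}}type")  # absent for a plain line break
        if kind == "page":
            counts["hard_page"] += 1
        elif kind == "column":
            counts["column"] += 1
    # Soft page breaks are written by Word where the text last happened to flow
    counts["soft_page"] = sum(
        1 for _ in root.iter(f"{{{W}}}lastRenderedPageBreak")
    )
    return counts


def count_breaks_in_docx(path):
    """Open a .docx (a zip archive) and count breaks in its main document part."""
    with zipfile.ZipFile(path) as archive:
        return count_breaks(archive.read("word/document.xml"))
```

Calling `count_breaks_in_docx("navodila.docx")` would then show whether the file has any manual page breaks at all, which helps separate a problem in the source file from a problem in the conversion.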
Director (Research) of Academic IT
University of Oxford IT Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431