It is no help to hear "it works for us," but it works for us: all the
time, in fact, and as recently as the end of the week before last.
We apply styles throughout the document before sending it to OxGarage.
Sometimes we have problems (I think caused by unusual characters), but
that has happened less in recent months.
If we run into problems we try either saving it in the same format using
a different processor (e.g. loading it into OO and then saving it as
.doc), or loading it in a different processor and saving it in that
processor's native format (i.e. open in OO and save in OO). I don't think
we've ever been totally defeated, and mostly it works very well.
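For anyone who would rather script the second workaround than do it by hand, here is a minimal sketch. We do this step interactively in OO, so the script is hypothetical: it assumes a LibreOffice install whose `soffice` binary is on the PATH, and the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def resave_command(source, target_format="odt", outdir="."):
    """Build the soffice command that re-saves `source` in another format."""
    return [
        "soffice",                      # assumption: LibreOffice binary on the PATH
        "--headless",                   # run without opening a window
        "--convert-to", target_format,  # e.g. "odt" or "doc"
        "--outdir", str(outdir),
        str(source),
    ]


def resave(source, target_format="odt", outdir="."):
    """Run the conversion and return the path soffice is expected to write."""
    subprocess.run(resave_command(source, target_format, outdir), check=True)
    return Path(outdir) / (Path(source).stem + "." + target_format)
```

Calling `resave("essay.docx", "odt", "resaved")` should leave `resaved/essay.odt`, which can then be sent through the conversion in place of the original file.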
There had been trouble early on with font style changes (e.g. italics or
bold turned on) producing a lot of noise in the XML. But recently that's
been excellent too. When it was a problem, we found that using character
styles instead of direct font properties worked.
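If it helps to see how much direct formatting a file still contains before converting it, something like the following could be used. It is only a sketch, not part of our workflow: it assumes the usual .docx layout (a zip archive with the main text in `word/document.xml`), and the function names are made up. It counts a run as "direct" whenever a `w:b` or `w:i` element is present, even if that element switches the property off.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_run_formatting(document_xml):
    """Count runs with direct bold/italic vs. runs with a character style."""
    root = ET.fromstring(document_xml)
    counts = {"direct": 0, "styled": 0}
    for run in root.iter(f"{{{W}}}r"):
        rpr = run.find(f"{{{W}}}rPr")  # run properties, if any
        if rpr is None:
            continue
        if rpr.find(f"{{{W}}}b") is not None or rpr.find(f"{{{W}}}i") is not None:
            counts["direct"] += 1
        if rpr.find(f"{{{W}}}rStyle") is not None:
            counts["styled"] += 1
    return counts


def count_in_docx(path):
    """Read word/document.xml out of a .docx file and count its runs."""
    with zipfile.ZipFile(path) as archive:
        return count_run_formatting(archive.read("word/document.xml"))
```

A high "direct" count relative to "styled" would suggest the document still relies on font properties rather than character styles.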
We'll be using it again this week, so perhaps we can see if our
experience replicates yours.
On 12-06-25 10:10 AM, Martin Holmes wrote:
> Hi Michelle,
> I think the documentation for the docx conversion is here:
> and I think you'll find XSLT files here:
> On 12-06-25 07:48 AM, Michelle Dalmau wrote:
>> Hi all,
>> My intern, Amanda, has been experimenting with the Word=>TEI
>> conversion module that's part of OxGarage with little success. We have
>> a couple of encoding projects that rely on Word documents as the
>> source documents (one more complicated by the use of "Track Changes"),
>> and I thought perhaps it was high time to explore options other than
>> "copy and paste" to facilitate their encoding workflow.
>> At this point, I am not sure of other troubleshooting/trial and error
>> mechanisms to explore so I appeal to you for your wisdom and help.
>> Here's the short version of what she's done so far:
>> 1. Converted a docx file straight up to TEI. Problems: headings and
>> lists were not recognized, empty comment tags appeared, random empty
>> 2. She revisited the Word docx, explicitly applied Word styles to
>> various structural elements. We were sure that in so doing, lists,
>> headings, etc. would transform properly. Only the headings transformed
>> properly. Still problems with all the rest.
>> Then Amanda took it to Open Office and used the TEI plugin converter.
>> Similar problems.
>> I dug around to see if I can access the underlying XSL(s), but I am
>> not sure where to go. I came here:
>> <http://www.tei-c.org/release/doc/tei-xsl-common/>, but the
>> documentation link for "default conversion from docx" is broken.
>> I should say that these projects, aside from the track changes which I
>> imagine can be mapped to additions, deletions and notes (what we do
>> manually), are fairly simplistic in structure. Divisions, headings,
>> paragraphs, lists, tables and the rare figure.
>> We are also operating under limited technical expertise (harken back
>> to Julia's original post on 2/28/2011), so bear with us.
>> Word Up,
>> | Michelle Dalmau, Digital Projects & Usability Librarian
>> | Indiana University Digital Library Program
>> | Herman B Wells Library
>> | 1320 East 10th Street, W501
>> | Bloomington, Indiana 47405
>> | (812) 855-1261