Hey, all. I'm looking to get my hands on sample XML output of a few
OCRed pages in various formats. The formats I know of are:
* ALTO 
* ABBYY FineReader
* PAGE 
* hOCR 
but since I never do any significant OCR myself, I don't know which
ones are good vs. bad or common vs. rare, and thus whether these are
even the ones I should be asking for.
So, if you have relatively easy access to OCR software that produces
XML output, and would not mind sending me a sample, please get in
touch off-list. Thank you!
P.S. Why? To work on providing crosswalks from the OCR XML formats
to the TEI in Libraries version 4.0 level 1 encoding. Thanks
 Had trouble finding informal definition, but schemas are at