Thank you, Lou and Pitor, for your responses. Yes, the second paragraph is supposed to be a translation, although, as you noticed, it is not perfect and some of the original text is included in it too. Regarding @type, you're right, that's a lot better: I can repeat it in the other lemmatized and translated divs, whereas I couldn't do that with @xml:id, since its values have to be unique within the document. As some of the divs contain more than one sentence, would it be better to stick with <p> rather than <s>?
Here is the same paragraph from the source data used:
<ComposedBlock ID="ZONE1-1" HPOS="522" VPOS="3999" HEIGHT="12834" WIDTH="3477">

            <TextBlock ID="PAG_1_TB2" STYLEREFS="TXT_3" HPOS="522" VPOS="3999" HEIGHT="12834" WIDTH="3477">
              <TextLine ID="PAG_1_TL3" STYLEREFS="TXT_3" HPOS="699" VPOS="3999" HEIGHT="171" WIDTH="3162">
                <String ID="PAG_1_ST4" STYLEREFS="TXT_3" HPOS="699" VPOS="4026" HEIGHT="144" WIDTH="612" WC="0.99" CONTENT="Dewiswyd"/>
                <SP ID="PAG_1_SP2" HPOS="1311" VPOS="4062" WIDTH="66"/>
                <String ID="PAG_1_ST5" STYLEREFS="TXT_3" HPOS="1377" VPOS="4062" HEIGHT="96" WIDTH="72" WC="1.00" CONTENT="y"/>
                <SP ID="PAG_1_SP3" HPOS="1449" VPOS="4029" WIDTH="78"/>
                <String ID="PAG_1_ST6" STYLEREFS="TXT_3" HPOS="1527" VPOS="4023" HEIGHT="111" WIDTH="399" WC="0.99" CONTENT="Parch."/>
                <SP ID="PAG_1_SP4" HPOS="1926" VPOS="4020" WIDTH="78"/>
                <String ID="PAG_1_ST7" STYLEREFS="TXT_3" HPOS="2004" VPOS="4017" HEIGHT="105" WIDTH="177" WC="0.99" CONTENT="W."/>
                <SP ID="PAG_1_SP5" HPOS="2181" VPOS="4020" WIDTH="78"/>
                <String ID="PAG_1_ST8" STYLEREFS="TXT_3" HPOS="2259" VPOS="4020" HEIGHT="102" WIDTH="111" WC="0.99" CONTENT="S."/>
                <SP ID="PAG_1_SP6" HPOS="2370" VPOS="4023" WIDTH="75"/>
                <String ID="PAG_1_ST9" STYLEREFS="TXT_3" HPOS="2445" VPOS="4017" HEIGHT="129" WIDTH="387" WC="0.99" CONTENT="Jones,"/>
                <SP ID="PAG_1_SP7" HPOS="2832" VPOS="4011" WIDTH="90"/>
                <String ID="PAG_1_ST10" STYLEREFS="TXT_3" HPOS="2922" VPOS="4008" HEIGHT="132" WIDTH="360" WC="1.00" CONTENT="M.A.,"/>
                <SP ID="PAG_1_SP8" HPOS="3282" VPOS="4041" WIDTH="75"/>
                <String ID="PAG_1_ST11" STYLEREFS="TXT_3" HPOS="3357" VPOS="4041" HEIGHT="93" WIDTH="156" WC="1.00" CONTENT="yn"/>
                <SP ID="PAG_1_SP9" HPOS="3513" VPOS="4008" WIDTH="84"/>
                <String ID="PAG_1_ST12" STYLEREFS="TXT_3" HPOS="3597" VPOS="4005" HEIGHT="126" WIDTH="264" WC="0.99" CONTENT="gad-" SUBS_TYPE="HypPart1" SUBS_CONTENT="gadeirydd"/>
                <HYP HPOS="4002" VPOS="4167" WIDTH="90" CONTENT="-"/>
              </TextLine>
              <TextLine ID="PAG_1_TL4" STYLEREFS="TXT_3" HPOS="552" VPOS="4152" HEIGHT="180" WIDTH="3315">
                <String ID="PAG_1_ST13" STYLEREFS="TXT_3" HPOS="552" VPOS="4182" HEIGHT="147" WIDTH="390" WC="0.99" CONTENT="eirydd" SUBS_TYPE="HypPart2" SUBS_CONTENT="gadeirydd"/>
                <SP ID="PAG_1_SP10" HPOS="939" VPOS="4197" WIDTH="48"/>
                <String ID="PAG_1_ST14" STYLEREFS="TXT_3" HPOS="987" VPOS="4191" HEIGHT="114" WIDTH="420" WC="0.99" CONTENT="Bwrdd"/>
                <SP ID="PAG_1_SP11" HPOS="1404" VPOS="4194" WIDTH="48"/>
                <String ID="PAG_1_ST15" STYLEREFS="TXT_3" HPOS="1452" VPOS="4191" HEIGHT="129" WIDTH="345" WC="0.99" CONTENT="Ysgol"/>
                <SP ID="PAG_1_SP12" HPOS="1794" VPOS="4188" WIDTH="54"/>
                <String ID="PAG_1_ST16" STYLEREFS="TXT_3" HPOS="1848" VPOS="4182" HEIGHT="132" WIDTH="480" WC="0.99" CONTENT="newydd"/>
                <SP ID="PAG_1_SP13" HPOS="2325" VPOS="4179" WIDTH="54"/>
                <String ID="PAG_1_ST17" STYLEREFS="TXT_3" HPOS="2379" VPOS="4170" HEIGHT="144" WIDTH="828" WC="0.94" CONTENT="Machynlleth,"/>
                <SP ID="PAG_1_SP14" HPOS="3204" VPOS="4173" WIDTH="66"/>
                <String ID="PAG_1_ST18" STYLEREFS="TXT_3" HPOS="3270" VPOS="4170" HEIGHT="105" WIDTH="165" WC="1.00" CONTENT="a'r"/>
                <SP ID="PAG_1_SP15" HPOS="3432" VPOS="4170" WIDTH="66"/>
                <String ID="PAG_1_ST19" STYLEREFS="TXT_3" HPOS="3498" VPOS="4167" HEIGHT="105" WIDTH="369" WC="0.99" CONTENT="Parch"/>
              </TextLine>
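
To show how the running text relates to the String elements above, here is a minimal Python sketch that rejoins the words of a block, including a word hyphenated across two lines (HypPart1/HypPart2 with SUBS_CONTENT). It is only an illustration: the sample is a cut-down, hypothetical version of the excerpt, the function name block_text is my own, and it assumes the elements carry no XML namespace, which a full ALTO file normally would.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down sample: only the attributes needed here are kept.
ALTO_SAMPLE = """
<TextBlock ID="PAG_1_TB2">
  <TextLine ID="PAG_1_TL3">
    <String CONTENT="yn"/>
    <SP/>
    <String CONTENT="gad-" SUBS_TYPE="HypPart1" SUBS_CONTENT="gadeirydd"/>
    <HYP CONTENT="-"/>
  </TextLine>
  <TextLine ID="PAG_1_TL4">
    <String CONTENT="eirydd" SUBS_TYPE="HypPart2" SUBS_CONTENT="gadeirydd"/>
    <SP/>
    <String CONTENT="Bwrdd"/>
  </TextLine>
</TextBlock>
"""


def block_text(block):
    """Join the words of a block, rejoining words hyphenated across lines."""
    words = []
    for string in block.iter("String"):
        subs_type = string.get("SUBS_TYPE")
        if subs_type == "HypPart1":
            # First half of a hyphenated word: take the full word instead.
            words.append(string.get("SUBS_CONTENT"))
        elif subs_type == "HypPart2":
            # Second half: already emitted with the first half, so skip it.
            continue
        else:
            words.append(string.get("CONTENT"))
    return " ".join(words)


text = block_text(ET.fromstring(ALTO_SAMPLE))
print(text)
```

With a namespaced ALTO file the tag names passed to iter() would need the namespace prefix added.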

The XML received back for the lemmatized and translated text:
<translation id="apnah00100001">
    <zone id="ZONE1-1">
        <original>Dewiswyd y Parch. ,W. S. Jones, M.A., yn gadeirydd Bwrdd Ysgol newydd Machynlleth,
            a'r Parch Josiah Jones yn is-gadeirydd. </original>
        <lemmatized>dewis y parch W S jones M a yn cadeirydd bwrdd ysgol newydd machynlleth a r
            parch Josiah jones yn is cadeirydd </lemmatized>
        <translated>The Parch. Dewiswyd W. S. JONES, M.A., THE CHAIR OF THE NEW SCHOOL MACHYNLLETH,
            A'R PARCH JOSIAH JONES IS-GADEIRYDD. W. S. JONES, M.A., THE CHAIR OF THE NEW SCHOOL
            MACHYNLLETH, A'R PARCH JOSIAH JONES IS-GADEIRYDD. W. S. Jones, M.A., the chair of the
            new School Machynlleth, a'r Parch Josiah Jones is-gadeirydd. </translated>
    </zone>

Revised template following feedback:
    <text>
        <body>
            <div xml:id="ZONE1-1">
                <div type="lemmatized">                   
                    <p>dewis y parch W S jones M a yn cadeirydd bwrdd ysgol newydd machynlleth a r
                        parch Josiah jones yn is cadeirydd</p>
                </div>
                <div type="translated" xml:lang="en">                  
                    <p>The Parch. Dewiswyd W. S. JONES, M.A., THE CHAIR OF THE NEW SCHOOL
                        MACHYNLLETH, A'R PARCH JOSIAH JONES IS-GADEIRYDD. W. S. JONES, M.A., THE
                        CHAIR OF THE NEW SCHOOL MACHYNLLETH, A'R PARCH JOSIAH JONES IS-GADEIRYDD. W.
                        S. Jones, M.A., the chair of the new School Machynlleth, a'r Parch Josiah
                        Jones is-gadeirydd.</p>
                </div>
            </div>
            <div xml:id="ZONE2-1">
                <div type="lemmatized"></div>
                <div type="translated" xml:lang="en"></div>
            </div>
        </body>
    </text>
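
As a rough sketch of how the template above could be filled from the XML received back, the following Python maps each zone to a div with nested lemmatized and translated divs. It is an illustration under assumptions rather than a finished converter: the function names zone_to_div and build_body are my own, the sample is a shortened, hypothetical version of the data, and the TEI namespace and header are left out.

```python
import xml.etree.ElementTree as ET

# Namespace behind the xml: prefix, needed for xml:id and xml:lang.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical, shortened sample of the XML received back.
SAMPLE = """
<translation id="apnah00100001">
  <zone id="ZONE1-1">
    <original>Dewiswyd y Parch.</original>
    <lemmatized>dewis y
        parch</lemmatized>
    <translated>The Parch. Dewiswyd</translated>
  </zone>
</translation>
"""


def zone_to_div(zone):
    """Build the outer div for one zone, with one typed div per layer."""
    div = ET.Element("div", {f"{{{XML_NS}}}id": zone.get("id")})
    for layer, lang in (("lemmatized", None), ("translated", "en")):
        inner = ET.SubElement(div, "div", {"type": layer})
        if lang:
            inner.set(f"{{{XML_NS}}}lang", lang)
        source = zone.find(layer)
        if source is not None and source.text and source.text.strip():
            p = ET.SubElement(inner, "p")
            # Collapse the line breaks and indentation inside the text.
            p.text = " ".join(source.text.split())
    return div


def build_body(translation):
    """Collect one div per zone inside a body element."""
    body = ET.Element("body")
    for zone in translation.iter("zone"):
        body.append(zone_to_div(zone))
    return body


body = build_body(ET.fromstring(SAMPLE))
print(ET.tostring(body, encoding="unicode"))
```

Because @type is an ordinary attribute, the same two values can be reused in every zone, while each zone id is written once as @xml:id on the outer div.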