Dear all,

Following up on my previous email, I wanted to let you know that I have found a solution.
For those who might be interested in parsing TEI data within R:

# The node sets (interp, interpRef, interpPers, interpPers_State,
# interpPlace_Loc) are first obtained with getNodeSet from //ns: queries, then:
library(XML)      # xmlGetAttr
library(stringr)  # word
listInterp <- character(length(interp))
for (i in seq_along(interp)) {
  a <- xmlGetAttr(interp[[i]], "ana")            # KTU
  b <- xmlGetAttr(interpRef[[i]], "ana")         # verb category
  c <- xmlGetAttr(interpPers[[i]], "ana")        # character
  d <- xmlGetAttr(interpPers_State[[i]], "ana")  # state
  e <- xmlGetAttr(interpPlace_Loc[[i]], "ana")   # location
  # select the attribute values by word position (negative = from the end)
  listInterp[i] <- paste(word(a, -1), word(b, -2, -1), word(c, -2), word(d, -1), word(e, -1))
}
listInterp <- gsub("#", "", listInterp, fixed = TRUE)  # remove the "#" prefixes

> Result for listInterp (sample)
[1] "ktu1-3_ii_l5b-6a verb.competition contend ANT active outside" 
[2] "ktu1-3_ii_l6b verb.competition contend ANT active outside"   
[3] "ktu1-3_ii_l7 verb.emotion humiliation ANT active outside"    
[4] "ktu1-3_ii_l8 verb.emotion humiliation ANT active outside"    
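
The word-selection step can also be tried on plain strings, without the node sets. What follows is only a minimal sketch in base R, assuming that each @ana value is a whitespace-separated list of tokens. The helper name pick_tokens is made up for illustration, and it uses strsplit rather than word from stringr:

```r
# Hypothetical helper (not part of any package): keep the tokens at the
# given positions of each whitespace-separated @ana value and strip a
# leading "#". Negative positions count from the end, as in stringr::word().
pick_tokens <- function(ana, positions) {
  vapply(strsplit(trimws(ana), "\\s+"), function(tokens) {
    idx <- ifelse(positions < 0, length(tokens) + positions + 1, positions)
    paste(sub("^#", "", tokens[idx]), collapse = " ")
  }, character(1))
}

# Sample value taken from the earlier message quoted below
ana <- "whatAction #ktu1-3_ii_l6b_tḫtṣb #confrontation #action"

second_and_third <- pick_tokens(ana, c(2, 3))  # 2nd and 3rd tokens
last_only <- pick_tokens(ana, -1)              # last token
print(second_and_third)
print(last_only)
```

Selecting by position only works as long as every @ana value lists its tokens in the same order, which is an assumption about the encoding and not something R can check.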




Vanessa Juloux | Ph.D. candidate

» Ecole Pratique des Hautes Etudes  (EPHE, France), Paris Sciences et Lettres (PSL, France) Research University 
» Cultural anthropology of Ancient Near East, 
» Data coordinator & digital humanities monitoring (EPHE, PSL)
» Chair Membership and Outreach Sub-committee for Europe (American Schools of Oriental Research, USA)
Mobile + WhatsApp: +33 (0) 6 98 97 02 02   
@vjuloux, skype: mosioatunya
On 6 Apr 2017, at 00:43, Vanessa Juloux wrote:

Dear all,

I am currently parsing TEI data within R, including @ana attributes on <ref> elements that carry several values.
My apologies for posting this to the TEI list, but since parsing TEI data is very specific, I hope you will be able to help me: I have been looking for an answer for a few days now.

Do you know how the hierarchy of multiple attribute values works?

Example in TEI: <ref ana="whatAction #ktu1-3_ii_l6b_tḫtṣb #confrontation #action">Action, subcategory confrontation
                                       <stage ana="whatResult #result #defeate_ofOpposition"/></ref>

The parsing in R is done with the getNodeSet and xmlGetAttr functions:

interpRef <- getNodeSet(doc, "//ns:ref[contains(@ana, 'whatAction')]", ns)
# a for loop returns NULL, so collect the values with sapply before printing
interpRef_ana <- sapply(interpRef, xmlGetAttr, "ana")
print(interpRef_ana)

[1] "whatAction #ktu1-3_ii_l6b_tḫtṣb #confrontation #action"
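
For anyone who wants to reproduce this step, here is a self-contained sketch using the XML package. It assumes the document is in the standard TEI namespace (http://www.tei-c.org/ns/1.0) and binds it to the prefix ns. The wrapping TEI, text, body and p elements are added only so that the fragment parses on its own and are not taken from the real file:

```r
library(XML)

# Inline TEI fragment. The <ref> element is the example from this message;
# the wrapper elements around it are assumed for illustration only.
tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
  <ref ana="whatAction #ktu1-3_ii_l6b_tḫtṣb #confrontation #action">Action, subcategory confrontation
    <stage ana="whatResult #result #defeate_ofOpposition"/></ref>
</p></body></text></TEI>'

doc <- xmlParse(tei, asText = TRUE)

# Bind the prefix "ns" to the TEI namespace so that //ns:ref can match
ns <- c(ns = "http://www.tei-c.org/ns/1.0")

interpRef <- getNodeSet(doc, "//ns:ref[contains(@ana, 'whatAction')]", ns)

# Collect the @ana value of every matching node into a character vector
interpRef_ana <- sapply(interpRef, xmlGetAttr, "ana")
print(interpRef_ana)
```

Without the namespace binding, an unprefixed //ref query finds nothing in a namespaced TEI file, which is why the prefix is needed in the XPath expression.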
Do you know how I can select only the relevant attribute values from @ana? I would like to get just:
[1] "#ktu1-3_ii_l6b_tḫtṣb #confrontation"

Any suggestions?

Thank you very much in advance.