Thanks, Piotr,

Yes, intuitive is subjective and depends on a lot of things, including the query environment. In my environment, people write queries in XPath or XQuery. Anything that requires them to parse the insides of an attribute makes that much harder. One of my goals is to just import this into BaseX or eXistDB and have it run efficiently. That doesn't work well with anything buried inside an attribute, because the database doesn't know that it needs to build an index on it. I also like to be able to treat word groups in the same way: many queries look at a group of words that may be a phrase, a clause, or an individual word with a particular semantic role, and I like as much commonality as possible in such queries.
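
A minimal sketch of the difference, in Python with the standard library's ElementTree (which supports only a small subset of XPath). The element name `w`, the attributes `msd`, `pos`, `case`, and `tense`, and the `key=value|key=value` packing are all made up for illustration and are not taken from any of the proposals under discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: several features packed into one attribute value.
packed = ET.fromstring(
    "<s>"
    '<w msd="pos=noun|case=nom">logos</w>'
    '<w msd="pos=verb|tense=pres">legei</w>'
    "</s>"
)

# Hypothetical markup: the same features as separate attributes.
flat = ET.fromstring(
    "<s>"
    '<w pos="noun" case="nom">logos</w>'
    '<w pos="verb" tense="pres">legei</w>'
    "</s>"
)


def nouns_packed(root):
    """Find nouns by parsing the inside of the msd attribute by hand."""
    hits = []
    for w in root.iter("w"):
        pairs = w.get("msd", "").split("|")
        features = dict(p.split("=", 1) for p in pairs if "=" in p)
        if features.get("pos") == "noun":
            hits.append(w.text)
    return hits


def nouns_flat(root):
    """Find nouns with a plain path predicate on the attribute."""
    return [w.text for w in root.findall(".//w[@pos='noun']")]


print(nouns_packed(packed))
print(nouns_flat(flat))
```

With separate attributes the whole test is a single path predicate, which is the kind of expression an XML database can potentially answer from an attribute index. With the packed form, every query has to carry its own string-splitting logic.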

My use cases and requirements all involve running specific queries on this dataset and other related datasets. I want to simplify those queries and make them run efficiently, under the assumption that sophisticated users are writing them in a Jupyter Notebook environment. So that's what I am optimizing for. Actually, I am also optimizing for one other thing: the ability to create and edit query trees using a little language called Treedown, then use a parser to create an XML representation. So it's quite possible that my use cases and requirements are substantially different from those of the other initiatives you pointed out.

I already responded to Toma in a separate message. Any thoughts on the markup I suggested there? Obviously, if I strip the namespaces when importing into a database, I eliminate the advantage of having them except for TEI conformance. But TEI conformance is the main advantage I am looking for here.

Thanks for pointing me to that mailing list. I will sign up.