> On Oct 25, 2018, at 7:26 PM, Markus Flatscher wrote:
> Contrary to a previous comment on this thread, ingesting XML (or TEI) content into MarkLogic (e.g., few large files to be split at ingest time, many small files to be loaded performantly in batch, etc.) could not be any easier than it is at the hands of a competent junior developer or DBA. MarkLogic Content Pump ("mlcp") is one in a suite of mature and well-documented tools for this kind of thing. (ref: https://docs.marklogic.com/guide/mlcp)
Why should one have to split large files at ingest time? When I loaded a few hundred medium-sized files into MarkLogic using default settings, the loads frequently failed with XDMP-FRAGTOOLARGE errors, and simple queries on the files I did manage to load frequently failed with "expanded tree cache full" errors. Many hours of tuning, fussing, and discussion with the experts on the MarkLogic dev mailing list followed. The suggestions involved writing piles of code using MarkLogic's proprietary XQuery extensions, and I believe it was also suggested that I redesign my documents to have smaller fragment sizes. I don't think it's necessary to explain on the TEI list why one can't just make two scenes out of one scene from a play because some database product can't handle fragments of a certain size.
Basically, I think MarkLogic is like Oracle. If you have the money to spend on the product, you probably also have the money to spend on the army of people it will take to develop for and support it. I have no doubt that it works reliably if you have the resources to pour into it.
Craig A. Berry
"... getting out of a sonnet is much more
difficult than getting in."