Hi TEI List,
I wanted to bring your attention to a (very) recent update to Pandoc, which adds a TEI "writer." This means that Pandoc can convert any of the various file formats it accepts as input (HTML, DocBook, LaTeX, docx, Markdown, and others) *to* TEI. The TEI that Pandoc outputs validates against the current version of the TEI Simple [https://github.com/TEIC/TEI-Simple] DTD. It is necessarily a lossy conversion because of how Pandoc works [1]. Additionally, the template [https://github.com/jgm/pandoc-templates] that the writer currently uses to generate complete, valid, standalone output reproduces only a very limited subset of the options available within the teiHeader (but that can be improved!).
Pandoc is a powerful tool, and if you'd like to try out the TEI writer, you can clone the master branch from GitHub [https://github.com/jgm/pandoc] and then build from source. There are some instructions at [http://pandoc.org/installing.html], but I've had good luck building and testing Pandoc with stack [http://haskellstack.org].
Should you find any problems or have recommendations for improvements, please file an issue on GitHub [https://github.com/jgm/pandoc/issues].
Best,
Chris
1. Pandoc converts all its inputs into an intermediate format, which its various writers then convert to output. That intermediate format has a notion of *emphasis* (which usually means "italics"), but it has no equivalents for the semantic categories in TEI that would be rendered with italics. (The result is that the output relies on `<hi rendition=...>` for many things.)
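
To make the footnote concrete, here is a toy sketch in Python of why the round trip through an intermediate format is lossy. It is not Pandoc's code: the class names, the `kind` labels, and the `simple:italic` rendition value are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


# Toy intermediate format (hypothetical; not Pandoc's real Haskell types).
# It knows about emphasis, but not about *why* something is emphasized.
@dataclass
class Str:
    text: str


@dataclass
class Emph:
    children: list


def toy_reader(kind, text):
    """Map a source span to the intermediate format.

    `kind` is a made-up label for what the source meant. Every kind that
    is conventionally italicized collapses into the same Emph node, so the
    distinction is gone before any writer runs.
    """
    if kind in {"title", "foreign", "stress"}:
        return Emph([Str(text)])
    return Str(text)


def toy_tei_writer(node):
    """Serialize the intermediate format to a TEI-like string.

    The writer only ever sees Emph, so the best it can do is a generic
    <hi> element. The rendition value is an illustrative assumption.
    """
    if isinstance(node, Emph):
        inner = "".join(toy_tei_writer(child) for child in node.children)
        return f'<hi rendition="simple:italic">{inner}</hi>'
    return escape(node.text)


title = toy_tei_writer(toy_reader("title", "Moby-Dick"))
foreign = toy_tei_writer(toy_reader("foreign", "sic"))
plain = toy_tei_writer(toy_reader("text", "Q & A"))
```

In this sketch a book title and a foreign word come out wrapped in exactly the same `<hi>` element, even though a hand-encoded TEI document would probably mark them up differently.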