Natural Language Processing

My primary project is the Classical Language Toolkit (CLTK), a framework for natural language processing (NLP) for the languages of Ancient, Classical, and Medieval Eurasia. Having founded the project, I am now responsible for overseeing its growth and technical excellence. See here for materials from my public talks on the CLTK.

For the Pema Ts'al Orthographic System (for which I am the lead developer), I have customized fonts and a keyboard that introduce several new punctuation characters to Tibetan orthography to aid beginners in reading the language.


While in academia, I published a few articles and wrote a dissertation. Of my publications, two of the more interesting are an article on what comics have to offer the study of literature (from Oxford University Press) and a short piece on Etruscan medicine, which to my utter surprise took on a life of its own as a foundation for contemporary pharmaceutical research. My dissertation is a network-theoretical study of Julius Caesar's organization and leadership of the Roman army.


I was born and raised in Kirkland, Washington. I now reside in San Francisco, where I work as a research scientist specializing in NLP. My formal education was in Classics (BA, Reed College; PhD, NYU).