Natural Language Processing

My primary project is the Classical Language Toolkit (CLTK), a framework for natural language processing (NLP) for the languages of Ancient, Classical, and Medieval Eurasia. I oversee the project and edit the libraries for Ancient Greek and Latin. See here for materials from my public talks on the CLTK.

For the Pema Ts'al Orthographic System, for which I am the lead developer, I have customized fonts and a keyboard that introduce several new punctuation characters to Tibetan orthography in order to aid beginners in reading the language.


While in academia, I published a few articles and wrote a dissertation. Among my publications, two of the more interesting are an article on what comics have to offer the study of literature (from Oxford University Press) and a short piece on Etruscan medicine, which to my utter surprise took on a life of its own as a foundation for contemporary pharmaceutical research. My dissertation is a network-theoretical study of Julius Caesar's organization and leadership of the Roman army.



I was born and raised in Kirkland, Washington. I now reside in San Francisco, where I work as a Principal Research Scientist, specializing in NLP. My formal education was in Classics (BA, Reed College; PhD, NYU).