An Automated Approach to Syntax-based Analysis of Classical Latin

  • Anjalie Field (Author)
    Princeton University
    B.A., Department of Computer Science, 2015



The goal of this study is to present an automated method for analyzing the style of Latin authors. Many of the common automated methods in stylistic analysis are based on lexical measures, which do not work well with Latin because of the language’s high degree of inflection and free word order. In contrast, this study focuses on analysis at the syntactic level by examining two constructions, the ablative absolute and the cum clause. These constructions are often interchangeable, which suggests that an author’s choice of construction is typically more stylistic than functional. We first identified these constructions in hand-annotated texts. Next, we developed a method for identifying the constructions in unannotated texts, using probabilistic morphological tagging. Our methods identified constructions with enough accuracy to distinguish among different genres and different authors. In particular, we were able to determine which book of Caesar’s Commentarii de Bello Gallico was not written by Caesar. Furthermore, the usage of ablative absolutes and cum clauses observed in this study is consistent with the usage scholars have observed when analyzing these texts by hand. The proposed methods for an automatic syntax-based analysis are shown to be valuable for the study of classical literature.
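
To make the idea of counting constructions from morphological tags concrete, the following is a minimal sketch and not the procedure used in the study. It assumes tokens have already been tagged with part of speech, case, and mood; the `Token` class, the tag labels (`"PART"`, `"abl"`, `"subj"`), the window size, and both heuristics are hypothetical choices made for illustration. The heuristics are deliberately crude: for example, cum clauses with an indicative verb would be missed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """One morphologically tagged token (tag labels are illustrative assumptions)."""
    form: str
    pos: str          # e.g. "NOUN", "VERB", "PART" (participle), "CONJ", "PUNCT"
    case: str = ""    # e.g. "abl", "nom"
    mood: str = ""    # e.g. "subj", "ind"


def count_ablative_absolutes(tokens: List[Token], window: int = 2) -> int:
    """Count ablative participles that have an ablative noun within `window` tokens."""
    count = 0
    for i, tok in enumerate(tokens):
        if tok.pos == "PART" and tok.case == "abl":
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            if any(t.pos == "NOUN" and t.case == "abl" for t in tokens[lo:hi]):
                count += 1
    return count


def count_cum_clauses(tokens: List[Token]) -> int:
    """Count conjunction 'cum' followed by a subjunctive verb before the next punctuation."""
    count = 0
    for i, tok in enumerate(tokens):
        if tok.form.lower() == "cum" and tok.pos == "CONJ":
            for later in tokens[i + 1:]:
                if later.pos == "PUNCT":
                    break
                if later.pos == "VERB" and later.mood == "subj":
                    count += 1
                    break
    return count


def per_thousand(count: int, n_tokens: int) -> float:
    """Normalize a raw count to a rate per 1000 tokens so texts of different length compare."""
    return 1000.0 * count / n_tokens if n_tokens else 0.0


# "urbe capta Caesar rediit": ablative noun + ablative participle
abl_abs = [
    Token("urbe", "NOUN", case="abl"),
    Token("capta", "PART", case="abl"),
    Token("Caesar", "NOUN", case="nom"),
    Token("rediit", "VERB", mood="ind"),
]

# "cum urbs capta esset, Caesar rediit": cum + subjunctive verb
cum_clause = [
    Token("cum", "CONJ"),
    Token("urbs", "NOUN", case="nom"),
    Token("capta", "PART", case="nom"),
    Token("esset", "VERB", mood="subj"),
    Token(",", "PUNCT"),
    Token("Caesar", "NOUN", case="nom"),
    Token("rediit", "VERB", mood="ind"),
]

print(count_ablative_absolutes(abl_abs), count_cum_clauses(abl_abs))
print(count_ablative_absolutes(cum_clause), count_cum_clauses(cum_clause))
```

Normalized rates of the two constructions, computed per text with `per_thousand`, could then serve as simple features for comparing authors or genres.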





Contributor or Sponsor
Christiane Fellbaum, Princeton University, Yelena Baraz
Latin, Syntax, Automated