The NIKAW Project: An Infrastructure of Texts, Entities and Language Models to Study the Circulation of Knowledge in the Ancient World

  • Margherita Fantoli (Author)
    https://orcid.org/0000-0003-3191-4860
  • Marijke Beersmans (Author)
  • Jens Bürger (Author)
    https://orcid.org/0000-0001-8900-9666
  • Evelien de Graaf (Author)
  • Mark Depauw (Author)
  • Alek Keersmaekers (Author)
  • Bart Thijs (Author)
  • Tim Van de Cruys (Author)
  • Toon Van Hal (Author)

Abstract

This paper presents the foundational work of the interdisciplinary project NIKAW (Networks of Ideas and Knowledge in the Ancient World), which aims to analyse social networks in ancient Greek and Latin texts through mentions of historical figures. As a critical first step, we address the challenge of Named Entity Recognition (NER) for these languages by leveraging transformer-based models enriched with domain-specific knowledge. Our experiments highlight data sparsity and annotation inconsistencies as key bottlenecks for model performance. In the second phase, we introduce a pipeline for Named Entity Linking (NEL), using the Wikisource edition of the Pauly-Wissowa encyclopedia as a knowledge base. We detail the creation of silver-standard (automatically annotated) and gold-standard (human-verified) training datasets, and report preliminary results from fine-tuning the BLINK model for NEL.

Published
2026-05-08
Language
English
Keywords
Large Language Model, Artificial Intelligence, Named Entity Recognition