Annotating Named Entities in the Trilingual Inscription at Ka’ba-ye Zartošt (ŠKZ)
Abstract
This study examines proper names in the trilingual inscription of Shapur I at Ka’ba-ye Zartošt (ŠKZ) located in Naqsh-e Rustam, Fars province, Iran. We introduce a corpus of Greek, Middle Persian, and Parthian versions of the inscription aligned at both sentence and word levels, using the Ugarit translation alignment tool. Through manual extraction and categorization, nearly 400 Named Entities (i.e., proper names) were identified and classified as persons (PER), locations (LOC), or location derivatives (LOCderiv). The paper addresses methodological challenges encountered during the alignment of the text, as well as the extraction and classification of Named Entities, including ambiguities in determining proper names, variations in how some names have been recorded across different versions, and complexities in maintaining consistency in categorizing names across various languages. Additionally, we highlight the value of the aligned corpus as a lexicographical resource beyond Named Entity annotation. All datasets, including the aligned versions of the text and the extracted Named Entities, are openly accessible via GitHub and Zenodo to provide a foundation for further historical and computational research. Lastly, we explore the possibility of adding further annotation layers and linking the corpus to other datasets.
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


