Kyriaki Giannikou, Typologies for the study of historical Greek texts: Perspectives from two UGent projects

Abstract

I will discuss the typologies developed by two digitally-oriented projects from the University of Ghent, EVWRIT and DBBE, for organising, categorising, and describing documentary and literary historical textual material in Greek. The EVWRIT (Everyday Writing in Graeco-Roman and Late Antique Egypt. A Socio-Semiotic Study of Communicative Variation) project focuses on Greek documentary texts, examining their external features to uncover social meaning in the communicative and administrative contexts of the Ptolemaic, Roman, and Byzantine periods. Its main goal is to illuminate the relationship between form and content in these historical texts, providing a multi-aspect and well-structured framework for analysis. Meanwhile, the Database of Byzantine Book Epigrams (DBBE) stores and presents metrical paratexts found in the margins of medieval Greek manuscripts, primarily focusing on original texts and scribal choices, while grouping and linking them to their edited versions. The DBBE focuses heavily on metadata, contextualising the texts through details of their production (date, place, manuscript, etc.) and also their handling by secondary literature, if present. By comparing the typologies used in both projects, I will highlight different approaches in structuring and presenting historical textual data, showcasing how they can offer equally valuable insights.

Practical information

This lecture will be given at the Typology workshop, organised by the grammateus project at the University of Geneva on 21-22 March 2025.

Date & time: Saturday 22 March 2025, 15:50

Location: Amphithéâtre 012A – Battelle D (Route de Drize 7, 1227 Carouge)

Eleonora Lauro & Colin Swaelens, Enhancing and Visualising Textual and Material Analysis of Manuscripts: A Graph-Based Approach

Abstract

Manuscripts are no longer studied as purely textual witnesses in a bottom-up approach as in stemmatological philology, but also as physical objects. Current computational developments enable new top-down approaches. Graph databases visualise complex relationships between chunks of data in an intuitive way, coming from – in our case – metrical paratexts of the Database of Byzantine Book Epigrams (Ricceri et al. 2023). We carried out a pilot study in which we clustered 200 occurrences of the same epigram based on textual differences and linguistic annotations (Swaelens et al. 2024). This already revealed complex relationships in the graph representation between clusters of texts, prompting scholars to dive deeper into the reasons why they are grouped. The current paper explores how a graph-based approach can present even more intricate connections between manuscripts by adding metadata (date, place, scribe) to the textual data. A qualitative analysis of both bottom-up and top-down approaches reveals that they complement each other and provide researchers with new perspectives.

Practical information

This lecture will be given at the “International Medieval Congress”, organised by the Institute for Medieval Studies at the University of Leeds. IMC 2025 will take place from Monday 07 July to Thursday 10 July 2025.

Date & time: Wednesday 9 July 2025, 14:15

Location: Leeds

More information about this conference and the full programme can be found here.

Kristoffel Demoen, Les paratextes métriques (‘book epigrams’) dans les manuscrits grecs comme objets matériels et comme liens textuels entre les producteurs des manuscrits, les œuvres transmises et les lecteurs

This lecture will be given at the ‘Séminaire Cultures anciennes et temporalités 2024-2025’, organised by the research centre HiSoMA (Histoire et Sources des Mondes Antiques).

Date & time: Friday 21 February 2025, 9:30-12:30

Location: Maison de l’Orient et de la Méditerranée Jean Pouilloux (86 rue Pasteur – Lyon 7e), Salle Reinach (4e étage)

More information can be found here.

Crash Course in Greek Palaeography

The Greek section of Ghent University, in collaboration with the Research School OIKOS and the Royal Library of Belgium, offers a two-day crash course in Greek palaeography. The course will take place on 27-28 May 2025 in Ghent and Brussels. It is intended for MA, ResMA and doctoral students in Classics, Ancient History, Ancient Civilizations, Byzantine studies, Medieval studies and related fields. Students must have a good command of Greek. The course offers an introduction to Greek palaeography from the Hellenistic period to the end of the Middle Ages and is specifically aimed at acquiring practical skills for research involving literary and documentary papyri and/or manuscripts. Participants will gain hands-on experience with original papyri housed at Ghent University Library and with manuscripts from the Royal Library of Belgium in Brussels.

Programme

The course will take place over two full days, with one session in Ghent on Tuesday, 27 May, and the other in Brussels on Wednesday, 28 May. Specialists in Greek palaeography will deliver lectures providing a chronological overview of the evolution of Greek handwriting, accompanied by introductions to the material features of both papyri and codices. The lectures will be followed by practical sessions, consisting of supervised reading of selected extracts from papyri and manuscripts in small groups. There will be guided exhibitions of selected papyri (in Ghent) and medieval manuscripts (in Brussels).

* detailed schedule to be announced *

Practical information

The study load is equivalent to 2 ECTS credits (2×28 hours). In preparation for the course, participants will be required to read secondary literature which will be distributed several weeks in advance. Additional materials will be provided in order to help develop further reading skills after the course.

There is no participation fee for this course. Lunches will be provided on both days free of charge. Travel and accommodation expenses are the responsibility of the participants. The train connection between Ghent Sint-Pieters Station and Brussels Central Station is frequent, with a travel time of less than 40 minutes. Participants may choose to lodge in either city.

The course will take place at the following venues:

Registration

Prospective participants should register by sending an e-mail to grigory.vorobyev@ugent.be with a short motivation letter (approximately 300 words), detailing their academic background, research interests and motivation for attending the course. Priority will be given to MA and doctoral students associated with OIKOS and those who have not previously had the opportunity to study palaeography. The deadline for registration is 1 March 2025. Applicants will be notified of the outcome shortly thereafter.

Colin Swaelens, Part-of-Speech Tagging & Lemmatisation in Unedited Greek: Simple Tasks, Complex Challenges?

Abstract

In today’s landscape of language technology, dominated by large language models, tasks like part-of-speech tagging and lemmatisation receive less attention in current NLP research. However, these tasks still pose significant challenges, especially for under-resourced, morphologically rich languages like Ancient Greek. Our project focuses on the verbatim transcriptions of Byzantine marginal poetry stored in the Database of Byzantine Book Epigrams (DBBE). Due to the highly interconnected nature of the poems, we aim to eventually perform similarity detection across the corpus. As a first step, we sought to annotate the DBBE with part-of-speech tags, morphological analyses, and lemmas. Although research on these tasks dates back to more straightforward rule-based systems from the 1970s, current taggers struggle with these unedited texts. The inconsistent orthography — largely due to itacism — adds to this complexity. To mitigate these issues, we trained a transformer-based language model encompassing classical, medieval, and modern Greek. Our experiments, however, revealed that fine-tuning the model for each annotation task was not always fruitful. There is a growing tendency to address such challenges with a multi-task head, allowing the model to process multiple annotations concurrently, drawing inspiration from cognitive psychology. This raises the question: will this more intricate solution outshine the seemingly more transparent methods of the past?

Practical information

This lecture will be given at the Computational Humanities Research Group Seminar Series, organised by the Department of Digital Humanities of King’s College London.

Date & time: Tuesday 10 December 2024, 4:00 pm

Location: Bush House, Strand Campus (30 Aldwych, London) & online

More information about this conference and the full programme can be found here.

Kyriaki Giannikou, Assessing and Reassessing Formulaicity: are editorial practices a blessing or a curse?

Abstract

Formulaicity is a widely discussed concept in the study of historical Greek, primarily due to the influence of the Homeric epics, where it is traditionally understood to arise from oral contexts in which formulaic sequences reduce processing effort during lengthy recitations. Formulaic language also appears in entirely written contexts, such as post-classical Greek administrative and legal documents, where high standardisation meets the need for accuracy and efficiency (see e.g. Nachtergaele 2023; Saradi 2019). The corpus I focus on, Byzantine book epigrams (short, metrical texts found in the margins of Byzantine manuscripts), presents a unique case. These paratexts, embedded in the medieval manuscript tradition, blend literary and documentary functions without any oral performance context, oscillating between practical precision and creative expression. This paper explores a methodological challenge in studying formulaic language within historical Greek corpora, focusing specifically on the Database of Byzantine Book Epigrams.

Even recent comprehensive research on Homer’s formulaic language (Bozzone 2024) relies on modern editions of the Homeric epics that attempt to reconstruct an ‘archetype’ based on medieval manuscript ‘witnesses’. In contrast, the DBBE diverges from strict adherence to traditional editorial practices by presenting epigrams preserving all original scribal choices (‘Occurrences’) while also offering ‘normalised’ versions (‘Types’) that group similar instances of the originals (Ricceri et al. 2023). This raises questions: To what extent can we rely on edited texts to analyse formulaicity? How might editorial choices, driven by the desire for a cohesive text, obscure the original variability of formulaic sequences? Does the interaction between formulaicity and editorial practices facilitate research, or does this create the impression of greater fixedness in formulae, potentially skewing certain aspects of the analysis?

This paper explores the potential impact of editorial intervention on formulaicity research, advocating for a more flexible methodology that balances the use of both edited and original sources. Through a case study on supplications for salvation within a subset of the DBBE corpus, I will demonstrate how formulaic expressions function in this hybrid referential-poetic (cf. Jakobson 1960) context, and how editorial practices may shape our understanding of formulaicity. Ultimately, this study seeks to position this material within the broader framework of formulaicity research and to discuss the implications of editorial practices for linguistic research in historical corpora.

Practical information

This lecture will be given at the conference ‘Formulaic Language in Historical Linguistics: data, methods, tools, and theory’, organised by the Academy of Finland project “The learning of Latin in the 8th to 12th century: a linguistic approach to medieval Latin literacies” in collaboration with the Classical Philological Society of Finland.

Date & time: to be confirmed

Location: Tieteiden talo (Kirkkokatu 6, Helsinki, Finland)

More information about this conference and the full programme can be found here.

Colin Swaelens, Similarity Detection: A Starting Point for Greek

Abstract

Antique literature survived thanks to scribes painstakingly copying texts from one manuscript to the other, prior to the art of printing. Occasionally, these scribes added metrical paratexts to the manuscripts, i.e. texts standing next to the main text (Genette, 1987) and introduced in Byzantine scholarship by Lauxtermann (2003) as book epigrams. Ghent University’s Database of Byzantine Book Epigrams (Ricceri et al., 2023) stores more than 12,000 of such epigrams, being verbatim transcriptions precisely as they are found in the manuscripts. This entails that the Greek of these epigrams is interspersed with orthographic inconsistencies, mainly due to phonetic changes like the itacism. These verbatim transcriptions are called occurrences and are grouped under one or more so-called types, a readable representation of its occurrences in standardised, classical Greek. Eventually, we aim to develop a dynamic system to group hemistichs, verses and epigrams based on distinct similarity measures in order for scholars to find all kinds of similar texts instead of only the ones that pop up in their mind. While developing those similarity measures, just like any other algorithm, evaluation is an essential part of the development process. However, a gold standard for the evaluation of verse similarity measures does not exist. At this point, we already conducted a pilot study on pairwise annotation of 2 verses with 10 annotators. Each verse was set off alongside six pairs of verses, of which the annotator had to mark the most similar one in their opinion. The inter-annotator agreement (IAA) yielded an agreement score of 57.69%, which is seen as a moderate agreement (Landis & Koch, 1977). This agreement score is the arithmetic mean of the agreement between each pair of annotators, as all annotators annotated the exact same set of verses. Despite the rather modest size of this pilot study, it is possible to unravel the distinct lines of reasoning of the annotators. 
They did not receive detailed instructions for the annotation process, so every annotator was free to choose their own focal point. The most remarkable of those focal points was the metre: one of the annotators based their judgement on the number of syllables in a verse. The majority, however, seemed to take syntax as the decisive factor in determining the most similar verse; semantics were decisive only if the syntax of both options was identical. While the gold standard is being annotated, we have already started computing similarity between words. These similarities will, in a next stage, be used to compute similarity between (half) verses. The main goal of the experiment is to find out whether transformer embeddings take into account enough context to find identical or similar words with deviant orthography.
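
The agreement score described above, the arithmetic mean of the agreement between each pair of annotators, can be sketched as follows. This is a minimal illustration only, not the authors' evaluation code: the function name `pairwise_agreement`, the annotator names and the toy labels are all hypothetical, and the sketch assumes that every annotator labelled the same items in the same order.

```python
from itertools import combinations


def pairwise_agreement(annotations):
    """Mean pairwise percentage agreement (hypothetical helper).

    annotations maps an annotator name to the list of options they chose,
    one entry per item, with all lists in the same item order.
    """
    scores = []
    for first, second in combinations(annotations, 2):
        choices_a = annotations[first]
        choices_b = annotations[second]
        # Share of items on which this pair of annotators chose the same option.
        matches = sum(a == b for a, b in zip(choices_a, choices_b))
        scores.append(matches / len(choices_a))
    # Arithmetic mean over all annotator pairs.
    return sum(scores) / len(scores)


# Toy data (invented): three annotators, four items, options labelled A-D.
toy = {
    "annotator_1": ["A", "B", "C", "A"],
    "annotator_2": ["A", "B", "D", "A"],
    "annotator_3": ["A", "C", "C", "B"],
}
print(pairwise_agreement(toy))
```

Raw percentage agreement of this kind does not correct for chance; whether the reported score was chance-corrected is not stated in the abstract.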

Practical information

This lecture will be given at the ‘Computational Approaches to Ancient Greek and Latin Workshop’, organised by KU Leuven and the University of Groningen. This workshop series started in 2021 with the aim of further exploring the potential of computational approaches (Natural Language Processing) applied to Ancient Greek and Latin. The 2024 edition will be held in hybrid format on 28 and 29 November 2024.

Date & time: Friday 29 November 2024, 13:45-14:30

Location: KU Leuven: Mgr. Sencie Instituut (Erasmusplein 2, 3000 Leuven, Belgium) & online

Register via this link. Registration for in-person attendance is no longer possible. The deadline for registration for online attendance is 27 November 2024.

More information about this conference and the full programme can be found here.

Kristoffel Demoen, Kyriaki Giannikou & Colin Swaelens, The Database of Byzantine Book Epigrams. Paratextual Poems from the Margins of Medieval Manuscripts to a Searchable Digital Corpus

This lecture will be given at the 8th International Byzantine Seminar Lecture Series (2024) on “Digital Methods for Byzantine Studies”, organised by the Institute for the History of Ancient Civilizations at the Northeast Normal University in Changchun (China), in collaboration with the Department of Byzantine and Modern Greek Studies at the University of Cologne and the Department of Historical and Classical Studies at the Norwegian University of Science and Technology.

Date & time: Thursday 21 November 2024, 11:00 am (CET)

Location: online via Zoom

Registration is free but required. The Zoom link will be provided upon registration. To register or for more information, email liq762@hotmail.com, mentioning “IBSLS Registration”.

LW Research Day 2024: poster session

The fourth LW Research Day will take place on Wednesday 27 November 2024, in the Ghent University Museum (GUM). The central theme is ‘From Source to Understanding’.

What is the role of interpretation in our journey from studying source material to scientific understanding? Indeed, that journey can never be devoid of interpretation, which, in many cases, serves as the quintessential bridge between source material and understanding, whether it pertains to a historical study based on ego documents, the archaeological perspective on the material culture of the past or the anthropological view of human behaviour. Not infrequently, interpretation itself becomes the object of research. For instance, translation scholars examine translation choices that result from interpretations. Literary and art scholars investigate works that themselves provide an interpretation of the world in which they originate and the world they create. Similarly, language itself reflects a particular understanding of the world in a historical and sociological sense, which linguists further explore. In times of digital humanities, the interpretation of (big) data by AI becomes not only conceivable but even the norm. What do interpretation and hermeneutics signify for our fields today? What constitutes a successful or legitimate interpretation, and what are the pitfalls of interpretation?

The PhD students of the DBBE team will present a poster on their research projects in the framework of the Database of Byzantine Book Epigrams.

  • Kyriaki Giannikou – Dealing with Building Blocks of Expression: Formulaic Elements & their Creative Variations in Byzantine Book Epigrams
  • Eleonora Lauro – Epigrams in Context: Glimpses into Medieval Southern Italian Book Culture

More information can be found on the LW Research Day website.

Kyriaki Giannikou, Navigating Digital Frontiers: Unveiling Formulaicity in Byzantine Book Epigrams

Abstract

Byzantine book epigrams, featuring as paratexts in manuscript margins, seamlessly intertwine poetic expression with practical details, illuminating aspects such as the manuscripts’ patrons and the identities of the scribes involved in transcription. Although deeply rooted in traditional book production practices and very formulaic in nature, these epigrams present noteworthy linguistic variation. While their formulaicity has been acknowledged, a thorough exploration of the formulaic sequences present in the Database of Byzantine Book Epigrams (DBBE) or similar corpora remains a gap in current research. My research, to be conducted on the well-established DBBE corpus, acts as a bridge between linguistic research on formulas inherent in everyday speech and those studied within the context of oral poetry.

This interdisciplinary project, adopting a corpus-driven approach, seeks to combine close reading with digital methods for navigating a vast corpus of Byzantine book epigrams. This research addresses the challenge of identifying formulaic constructions (i.e. pairings of form and meaning in the context of Construction Grammar) that function as “verse building blocks” and their variation within a historical linguistic corpus that combines poetic expression and practical information. However, the digital journey of pattern identification encounters challenges arising from inherent complexities of Greek – from flexible syntax to extensive morphological variety – compounded by great linguistic variation across registers, ranging from Homeric and classicizing Greek to medieval forms interwoven with vernacular elements. The absence of critical texts for numerous epigrams further complicates matters, preserving the idiosyncrasies of original scribal choices on the one hand, but impeding uniformization for digital analysis on the other.

This presentation sets out the challenges of working on Byzantine paratextual material in a Digital Humanities project that seeks to unravel the linguistic nuances of Byzantine book epigrams, with the aim of better understanding the complexities at the intersection of Byzantine literature and Digital Humanities.

Practical information

This lecture will be given at the international workshop ‘The Impact of Digital Methods and Approaches on Ancient Studies Research’ (13-14 May 2024, Berlin).

Date & time: Monday 13 May 2024, 4:40 pm

Location: Freie Universität Berlin (Hittorfstraße 18, 14195 Berlin)

More information about this workshop and the full programme can be found here.