CroALa index locorum: a gazetteer of place names in Croatian Latin texts
Our small team of Neo-Latin scholars and students at the University of Zagreb, Croatia, has been awarded one of the precious Pelagios grants for a proposal to make the unknown less so: that is, to index a number of place references in our textual collection with machine-actionable and semantically annotated URNs. Here I’ll sketch why these analytical annotations matter for Pelagios, for us, for scholars of Neo-Latin literature, and for philologists in general (the text of our proposal can be found in the GitHub repository of the project).
The historical corpus of Croatian literature consists of writings in several languages: not only in Croatian, the language of the modern nation, but also in three other languages of culture, namely Latin, Italian, and German. The richest of these other-language bodies of texts is the Latin one. Latin was in continuous use in Croatia for a very long time (almost a thousand years, from the 10th to the early 20th century), and until 1850 Croatian authors published in print twice as many titles in Latin as in Croatian.
In the digital medium, the corpus of Croatian Latin is represented most fully by the collection Croatiae auctores Latini (CroALa), an ongoing project at the University of Zagreb; I’m a member of its editorial team. The collection currently comprises some 5.7 million words in 467 documents, written between 976 and 1984. These documents remain, however, little known even to historians of Croatian literature, because the literary canon (as formed after 1850) included first and foremost works in the national language, with only perfunctory praise for everything else. With about 5 million speakers, Croatian is a small language; even today it is easier to find fluent scholarly readers of Latin than of Croatian. But to fluent scholarly readers of Latin, trained on ancient literature, so-called Neo-Latin literature in general, and Croatian Latin in particular, is, again, unknown territory.
This situation has led us to start exploring ways to make this unknown territory more accessible and hospitable. One such way is the well-known and deceptively humble device of the index, “by which”, according to Jonathan Swift, “the whole book is governed and turned, like fishes by the tail.”
Today the internet and its vast network of machine-actionable links make possible a better index than was even conceivable in the medium of print. Now we can provide an index to a whole collection, comprising not one but hundreds of books; we can also provide more, more specific, and more fine-grained information about each index entry and its occurrences. For example, besides stating that the word Papiae in a certain document or set of documents is a place name referring to the Renaissance city called Pavia in Italian, we can also state that the word Ticinium elsewhere refers to the same place. We can also assert that both occurrences refer to the city in the same historical period, and we can claim that both are non-fictional references to a place (as opposed to references to a fictional place, or to a rhetorically personified entity).
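To make the Pavia example concrete, here is a minimal Python sketch of how such index entries might be recorded and grouped. The field names, the identifier `place:pavia`, and the period label are illustrative assumptions, not the data model actually used in CroALa.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceReference:
    word_form: str  # the word as it occurs in the text
    place_id: str   # hypothetical identifier of the place referred to
    period: str     # hypothetical label for the historical period
    kind: str       # "non-fictional", "fictional", or "personified"


# Two occurrences with different word forms that point to the same place.
references = [
    PlaceReference("Papiae", "place:pavia", "renaissance", "non-fictional"),
    PlaceReference("Ticinium", "place:pavia", "renaissance", "non-fictional"),
]


def forms_by_place(refs):
    """Group the attested word forms under the place they refer to."""
    grouped = defaultdict(set)
    for ref in refs:
        grouped[ref.place_id].add(ref.word_form)
    return {place: sorted(forms) for place, forms in grouped.items()}


index = forms_by_place(references)
print(index)
```

Because the word form and the place identifier are kept apart, a reader (or a program) can start from either side: from a form in the text to the place, or from the place to all its attested forms.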
Such analyses, which actually constitute a lot of what philologists usually do, are typically made accessible and readable by humans in scholarly articles, monographs, or commentaries; nowadays, the analyses can also be made actionable by machines. There are very precise mechanisms (such as CTS URNs) for pointing from an annotation to a specific location in a specific version of a text (the so-called analytical exemplar). And, of course, there is Pelagios, as a methodological framework and a curated set of best practices for annotating and linking place references in various sources.
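As an illustration of how a CTS URN points into a text, the following Python sketch splits a URN into its namespace, work component, passage, and optional subreference. The URN in the usage example is invented for this post and does not identify a real CroALa document; passage ranges are deliberately not handled.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    namespace: str
    work: str                    # textgroup.work[.version[.exemplar]]
    passage: Optional[str]       # citation within the text, if any
    subreference: Optional[str]  # the part after "@", e.g. a word in the passage


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into its main components (simplified: no ranges)."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work = parts[2], parts[3]
    # A URN that names a whole work may end with an empty passage component.
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    subreference = None
    if passage and "@" in passage:
        passage, subreference = passage.split("@", 1)
    return CtsUrn(namespace, work, passage, subreference)


# Hypothetical URN: first occurrence of "Papiae" in passage 1.12 of an edition.
urn = parse_cts_urn("urn:cts:croala:author01.work01.edition01:1.12@Papiae[1]")
print(urn.work.split("."), urn.passage, urn.subreference)
```

The dot-separated work component is where a specific version (and, one level further down, an exemplar) of a text is named, which is what lets an annotation target one particular edition rather than the work in the abstract.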
The CroALa index locorum, therefore, aims to make CroALa, as a textual collection, more accessible by providing an initial set of generally well-known place names referred to in the generally less-known Croatian Latin texts. The index will also contribute to the body of knowledge and methods aggregated in Pelagios by annotating a significant number of Early Modern place references, and by claiming in these annotations that a specific reference denotes a place at a specific time (or that the temporal dimension of a place reference is intentionally left undefined). We plan to take into account not only references to “real” places (for example, Rome as the capital of the Roman Republic and Empire), but also references to places which are rhetorically transfigured (for example, personified, like the Rome which owes something to Camillus in a verse by Janus Pannonius: Multum Roma suo debet reparata Camillo), or to places which are imaginary (for example, the Underworld or the Heavens). Finally, being textual scholars, we also intend to take into account the lexical (morphological) form of the reference, providing analytical statements about the part of speech of a certain place reference (that it is, for example, a proper noun in the ablative case). The analyses we prepare will be conveyed and made interoperable through a semantic data model (RDF); the work of philologists Neel Smith and Christopher Blackwell (for example, their Abracadabra module of the CITE Architecture) has helped us understand how to simplify the process of recording the information and connecting it with a passage in a text, which should, again, be of interest to anyone wishing to implement our workflow for their own references.
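To suggest what one such annotation could look like as RDF, here is a rough Python sketch that writes a handful of N-Triples statements by hand. It borrows `Annotation`, `hasTarget`, and `hasBody` from the Open Annotation vocabulary; everything under `http://example.org/` (the annotation and place URIs, and the `period`, `referenceKind`, and `morphology` properties) is a placeholder assumed for this example, as is the target URN. It is not the model the project has settled on.

```python
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/croala/"  # hypothetical namespace for this sketch


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def annotation_triples(annotation_id, target_urn, place_uri,
                       period, kind, morphology):
    """Return N-Triples lines describing one annotated place reference."""
    subject = iri(EX + annotation_id)
    rows = [
        (subject, iri(RDF_TYPE), iri(OA + "Annotation")),
        (subject, iri(OA + "hasTarget"), iri(target_urn)),  # passage in the text
        (subject, iri(OA + "hasBody"), iri(place_uri)),     # place referred to
        (subject, iri(EX + "period"), literal(period)),
        (subject, iri(EX + "referenceKind"), literal(kind)),
        (subject, iri(EX + "morphology"), literal(morphology)),
    ]
    return [f"{s} {p} {o} ." for s, p, o in rows]


# The personified Roma of the Pannonius verse, with invented identifiers.
triples = annotation_triples(
    annotation_id="annotation/1",
    target_urn="urn:cts:croala:author01.work01.edition01:1.12",
    place_uri="http://example.org/places/rome",
    period="undefined",
    kind="personified",
    morphology="proper noun, nominative singular",
)
print("\n".join(triples))
```

Keeping the place, the period, the kind of reference, and the morphological form as separate statements about the same annotation means each claim can be queried, or disputed, on its own.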
We are looking forward to the next four months of the ménage à trois between places, philology, and linked open data!