KIMA02 pains: thoughts about projects and infrastructure
It has been a rocky time for the KIMA team. While working on KIMA02 we were also busy presenting KIMA throughout the autumn season: discussing “Place, Space and Conceptual Change: a view from language” at the 20th International Conference of Conceptual History in Oslo; showing off to the Jewish studies community at the 17th World Congress of Jewish Studies, where we participated in a fascinating linked open data session (alas, we somehow dropped off the recording, so you can only see our concluding teaser for the poster session); and then at the EVA / Minerva XIVth Annual International Conference for Professionals in Cultural Heritage, also in Jerusalem.
Following these presentations, we were approached by several potential users and discussed several promising directions for collaboration. These include Footprints, a project on the history of Jewish books that maps the traces left on book copies by the people who read them, bought and sold them, and left their marks; a correspondence study by Elie Fischer of responsa networks among rabbis in Eastern Europe; a mapping platform for modern Jewish culture; and finally FID Judaica, the German expert information service for Jewish studies.
Jewish Cultures Mapped (JCM) | A web-based platform developed by Da’at Hamakom http://www.jewish-cultures-mapped.org/
The interest in KIMA as a resource was helpful in modelling our API and encouraging in itself. Yet it made us face a serious dilemma:
KIMA was conceived as an attestation-based historical gazetteer, and our interest and efforts in KIMA01 were focused on this. This, as we saw it, was its added value and innovation. Our data, which consisted of around 300,000 attestations, in the end came to just 1,500 places mentioned in the historical sources. And though we are populating it with more attestations from our 4 data sources for KIMA02, we will probably not increase the number of places significantly.
Yet what our potential users needed, whether they were thinking of using KIMA through Recogito or connecting to our API, was an all-encompassing gazetteer; the attestation information and the historicity of the names were of no interest to them. With our limited resources, we had to decide where to invest our time. In a nutshell: in order to be an infrastructure, we had to compromise on our own project vision of having all variants dated and connected to a referenced and linked attestation, and simply include all the known Hebrew-script variants we could get.
We therefore went back to our external ID resources, the large and famous gazetteers: GeoNames, Wikidata and the NLI thesaurus (Agron). We had previously thought of these as external sources for linking, validating and verifying our set of attested places, but now we started working top down rather than bottom up. We extracted all GeoNames places that had Hebrew variants (13,593), all the places in the thesaurus of the National Library of Israel (9,365) with their VIAF IDs, and all Wikidata items that had Hebrew forms and variants. Our main work now is to match them to each other and validate the matches. We are also interested in checking the Getty thesaurus data but have not got there yet. This too is a lot of work, which we do not expect to finish this year, but we have every intention of continuing.
We will of course keep our database of attestations and the information it entails, and we aspire to make it bigger, but this will have to wait for a later stage in KIMA’s development.
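To give a rough idea of what matching the three sources against each other can involve, here is a minimal sketch in Python. It is not our actual pipeline: the function names, the record layout and the IDs (`gn-1`, `nli-1`, `wd-1`) are all hypothetical, and the only technique shown is grouping records by a normalized Hebrew name (vowel points and other combining marks stripped). Groups that span more than one source are only candidates, which would still need validation, for example by coordinates or by hand.

```python
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Strip niqqud and other combining marks, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return " ".join(stripped.split())


def candidate_matches(records):
    """Group records from different sources that share a normalized variant.

    records: iterable of (source, record_id, list_of_name_variants).
    Returns {normalized_name: {(source, record_id), ...}} for names that
    appear in more than one source.
    """
    index = defaultdict(set)
    for source, record_id, variants in records:
        for variant in variants:
            index[normalize(variant)].add((source, record_id))
    return {
        name: refs
        for name, refs in index.items()
        if len({source for source, _ in refs}) > 1
    }


# Hypothetical sample records; the IDs are placeholders, not real identifiers.
JERUSALEM_POINTED = (
    "\u05d9\u05b0\u05e8\u05d5\u05bc\u05e9\u05c1\u05b8\u05dc\u05b7\u05d9\u05b4\u05dd"
)
JERUSALEM_PLAIN = "\u05d9\u05e8\u05d5\u05e9\u05dc\u05d9\u05dd"
HAIFA = "\u05d7\u05d9\u05e4\u05d4"

sample = [
    ("geonames", "gn-1", [JERUSALEM_PLAIN]),
    ("nli", "nli-1", [JERUSALEM_POINTED]),
    ("wikidata", "wd-1", [HAIFA]),
]

matches = candidate_matches(sample)
for name, refs in matches.items():
    print(name, sorted(refs))
```

Exact matching on a normalized string is deliberately conservative: it will miss spelling variants (plene versus defective spelling, for instance) and will wrongly group different places that share a name, which is why the validation step remains the bulk of the work.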