A place for discussion about anything related to the Recogito annotation platform
New use case. Herbarium Specimens
May 26, 2016 at 2:13 pm #1255
Hi Pelagios Community,
I’m Robert Cubey, Plant Records Officer at the Royal Botanic Garden Edinburgh. I saw a small GIF showing a use case for #Recogito2 on Twitter and thought it might be interesting to add another use case to your community.
We have a herbarium with 3 million specimens (sheets of card with a bit of plant stuck to them) – the value of the sheets is that they are a referable resource as to what was growing where and when.
We are in the middle of a push to database and image (/digitise) our specimens and out of the 3 million we have 3/4 million databased and 333K imaged.
In an effort to speed up the imaging we have been recording minimal data from the sheets: usually the information from the location of the sheets (i.e. on the cupboard door, not on the sheet itself) and a country of collection. But there is a lot more information on the sheets.
So we have been investigating ways of capturing this information – there are a few ways of doing this via OCR tools, crowd sourcing etc. But in some cases it is more than transcription that we would like to capture: in an effort to make the material as useful as possible, we need to turn the locality text string into LAT/LONG coordinates (or provide a polygon in which the material was collected), i.e. we need to geo-reference the specimen – and this is where I think the Recogito2 software could be useful. There have been some other attempts at doing this, but nothing really “usable” as yet.
The other use case for the software is adding determinations to a specimen. Over time botanists look at specimens and re-name and re-classify them – to do this they stick a paper slip onto a specimen with their opinion (agree / disagree). But how do we do that for a digital herbarium? Re-imaging for every re-determination would be a massive issue, so how do we add annotations (easily and quickly)?
The RBGE herbarium dataset is available from http://elmer.rbge.org.uk/EOL/dwca/data/darwin_core.zip in the DWC (Darwin Core) schema, see http://rs.tdwg.org/dwc/. Links to the images are available in the zip. There are also JSON data resources available.
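To give a rough idea of what “needs geo-referencing” means in Darwin Core terms, here is a minimal sketch that picks out records with a locality string but no coordinates. The column names are standard Darwin Core terms, but whether the export uses exactly this layout is an assumption, and the two sample rows (including the catalogue numbers) are invented for illustration.

```python
import csv
import io

# Invented sample rows using standard Darwin Core term names.
# The real export may be laid out differently (assumption).
SAMPLE = """catalogNumber,country,locality,decimalLatitude,decimalLongitude
X0000001,Nepal,"Langtang valley, above Kyanjin",,
X0000002,Chile,"Near Puerto Montt",-41.47,-72.94
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def needs_georeferencing(rows):
    """Yield records that have a locality string but lack coordinates."""
    for row in rows:
        locality = (row.get("locality") or "").strip()
        lat = (row.get("decimalLatitude") or "").strip()
        lon = (row.get("decimalLongitude") or "").strip()
        if locality and not (lat and lon):
            yield row


todo = list(needs_georeferencing(load_rows(SAMPLE)))
for record in todo:
    print(record["catalogNumber"], "-", record["locality"])
```

Records selected this way are the ones where a tool would have to turn the locality text into a place or a coordinate.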
May 30, 2016 at 10:09 am #1278
Thanks for introducing yourself & your use case! I’m very interested in finding ways in which Recogito might be useful to you. I must admit though that it’s probably tricky, since there are a few “design differences” in Recogito that mean it won’t be directly usable for you. But at least in the medium/longer run there may be enough potential to think about what it would take in terms of modification.
The most important design difference (resulting purely from our own project history) is that Recogito is built for individual users, each with their own (small-ish) set of (text or image) documents, which they upload themselves and then annotate alone or with a group of collaborators. Obviously, any form of upload wouldn’t make sense in your case, neither in terms of the manual effort required, nor in terms of the data duplication that would arise. So what we’d need is some sort of direct integration with your database.
Then there are two additional design differences – but these are probably less serious and/or not a problem at all.
- First, users generally annotate “inside” the document – a paragraph of text, a rectangular section in an image – as opposed to adding metadata to the document as a whole. But we also had the scenario previously, where we needed to record things such as “findspot” or “place of origin” for a specific document as a whole. So this may come back in one form or the other, anyway.
- Second, Recogito is fundamentally based on controlled vocabularies throughout. On the one hand, that’s definitely a perfect match for adding classifications, of course. On the other hand, this also applies to geo-tagging: it’s not possible to assign a coordinate or polygon directly; you always assign a link to a “place” (in terms of an authority list of places), which then has geometry attached to it (for mapping and geo-search). If you already have a gazetteer in your system for this purpose, it could be integrated, so things will be fine. I just want to clarify this principle upfront, as it puts a (deliberate) limit on your freedom to associate arbitrary coordinates/regions with an annotation.
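To make the second point concrete, here is a minimal sketch of the principle: the annotation only stores a link to a gazetteer place, and any geometry is looked up through that link. This is not Recogito’s actual data model; the class names, the `example.org` URI and the approximate centroid are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Place:
    """An entry in an authority list of places (hypothetical shape)."""
    uri: str
    label: str
    centroid: Tuple[float, float]  # (latitude, longitude)


@dataclass(frozen=True)
class Annotation:
    """Holds a link to a place, never a coordinate of its own."""
    document_id: str
    quote: str
    place_uri: Optional[str] = None


class Gazetteer:
    def __init__(self, places: Iterable[Place]):
        self._by_uri = {p.uri: p for p in places}

    def resolve(self, uri: str) -> Optional[Place]:
        return self._by_uri.get(uri)


def geometry_for(annotation: Annotation, gazetteer: Gazetteer):
    """Return the linked place's centroid, or None if unlinked/unknown."""
    if annotation.place_uri is None:
        return None
    place = gazetteer.resolve(annotation.place_uri)
    return place.centroid if place else None


# Hypothetical gazetteer entry with an approximate centroid.
edinburgh = Place("https://example.org/places/edinburgh", "Edinburgh", (55.95, -3.19))
gazetteer = Gazetteer([edinburgh])

linked = Annotation("sheet-1", "near Edinburgh", edinburgh.uri)
unlinked = Annotation("sheet-2", "roadside bank")

print(geometry_for(linked, gazetteer))
print(geometry_for(unlinked, gazetteer))
```

The consequence is the limit described above: a specimen can only be mapped if its locality can be tied to an entry in the gazetteer.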
But going back to the integration issue: In fact, I’ve been thinking about scenarios such as yours for a while, as many of our partners have similar databases; and in Pelagios we do, so far, not have any tool support for this. In theory, I can see two potential future integration options:
- Bringing “a little Recogito functionality” into users’ existing databases. I.e. having a small “widget” in your own data-entry forms that pops up when you fill e.g. a place metadata field (similar to the annotation popup shown in the animated GIF).
- “Attaching your database to Recogito”. I.e. building a technical interface through which Recogito can use your DB backend as document repository, while keeping the rest of the UI as it is.
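For the second option, the “technical interface” could be as small as a couple of methods that an external database has to provide. The sketch below is purely hypothetical: none of these names exist in Recogito, and the record layout (a dict with an `image_url` key) is an assumption about what the backing database might expose.

```python
from typing import Dict, Iterator, List, Optional, Protocol


class DocumentRepository(Protocol):
    """Hypothetical contract an external DB backend would fulfil."""

    def list_ids(self) -> Iterator[str]: ...

    def image_url(self, doc_id: str) -> Optional[str]: ...


class HerbariumRepository:
    """Toy in-memory backend standing in for a real specimen database."""

    def __init__(self, records: Dict[str, dict]):
        self._records = records

    def list_ids(self) -> Iterator[str]:
        return iter(sorted(self._records))

    def image_url(self, doc_id: str) -> Optional[str]:
        record = self._records.get(doc_id)
        return record.get("image_url") if record else None


def documents_to_annotate(repo: DocumentRepository) -> List[str]:
    """Only specimens that already have an image can be annotated."""
    return [d for d in repo.list_ids() if repo.image_url(d) is not None]


# Invented records; identifiers and URLs are placeholders.
repo = HerbariumRepository({
    "specimen-2": {"image_url": None},
    "specimen-1": {"image_url": "https://example.org/images/specimen-1.jpg"},
})

print(documents_to_annotate(repo))
```

With something like this in place there would be no upload step and no duplicated data; the annotation UI would only ever ask the database for identifiers and image links.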
I think both these options would be quite feasible, but would require extra effort to implement properly. (Or at least they would require some thought first, in order to get to a serious estimate.)
Some sort of trial/proof-of-concept type thing, however, would probably be feasible in much less time (a few days’ work?). But I need to say clearly upfront that none of this is planned (or budgeted) in our current project plan. So whatever we do, you’d either need to look into this yourself (which we’d be happy to support!) or, alternatively, try to find some extra funding for a joint activity, if this is an option for you.
Either way, I’m definitely interested in looking further into Recogito/database-integration issues. So any further thoughts and input you might have are very valuable & welcome to us!