Linked Places, Dec 2016
On December 15, 2016 I presented work on a project titled Linked Places at the Linked Pasts meeting in Madrid, on behalf of my collaborators Lex Berman and Rainer Simon. This second phase of Linked Pasts was generously supported in part by a Pelagios Commons Resource Development Grant. The following final report replicates a 10 December posting I made on my own blog, at http://kgeographer.org. The experimental demonstration app produced can be viewed at http://topotime.org/linkedplaces; the GitHub repository holds all code produced: https://github.com/kgeographer/topotime.
A little context
The tag line for the Pelagios Commons web site is, “Linking the Places of our Past,” and that project is indeed facilitating the linking of historical place attestations published in digital gazetteers. From my perspective (and many others’), the initiative is going great, bravo!
There are other ways that places are or have been linked, and I’ve been plugging away at facilitating the representation and analysis of those connections in a couple of ways. The first was The Orbis Initiative, an ambitious and sadly unsuccessful NSF grant proposal to develop software and systems for extracting information about roads, rivers, canals, railways, and footpaths–and the places connected by them–from the million or so high-quality scans of historical maps. That data is of the physical channels (a.k.a. media, ways) used for the movement of people and goods across the earth’s surface. Although the grant wasn’t awarded, I’m happy to say a manageably-sized portion of the work it described was taken up by the CIDR team at Stanford University Libraries, just as I was leaving (amicably) in September. I expect fantastic results!
Since that work on geographic networks is in such good hands, I’ve begun to focus on the other side of that coin, the movement over such networks: individual journeys, named historical routes and route systems, and flows. I’m calling the project Linked Places, and a mini-grant from Pelagios Commons has helped to jump-start it. It’s part of my larger DH/GIScience research frame, Topotime, which has a broad goal of joining Place and Period in data stores and software for historical research and education. At the moment, the Linked Places work dominates the Topotime GitHub repository, but I’ll break things out soon.
Enough context; this blog post is intended to describe the status of the Linked Places work products.
Linked Places Phase Two Status
I’ve described the goals of Linked Places and its early results in two blog posts on Pelagios Commons earlier this year (July and October respectively). In Phase One, Lex Berman and Rainer Simon joined me in clarifying a conceptual model for what we wanted to do, refining a provisional spec for a GeoJSON temporal extension (GeoJSON-T), then adapting the GeoJSON-T format for representing route data. We agreed on the term route for an overarching class encompassing journeys, flows, and historical routes and route systems (hRoutes). The conceptual model was then “expressed” in the GeoJSON-T form (Figures 1 and 2).
In Phase Two, I holed up in beautiful Ascoli Piceno to a) convert five exemplar data sets to a generic CSV form, b) write Python scripts to transform that CSV to GeoJSON-T and to populate an ElasticSearch index, and c) build a demo web map application that consumes GeoJSON-T data and puts it through some paces. That app, which mashes up a Leaflet/Mapbox map with a Simile Timeline, is not designed as such–it’s been thrown together for discussion about what real apps might be interesting. I will be presenting this now-completed Phase Two work at the Linked Pasts workshop in Madrid, 15-16 December 2016.
Linked Places Work Products
GeoJSON-T simply adds an optional “when” element to native GeoJSON. That “when” is typically placed at the same level as a “geometry” element (the “where”), which can appear in a couple of places: as a top-level attribute of a Feature (Figure 1), or, in the case of routes data, as a member of a GeometryCollection (Figure 2). The GeoJSON GeometryCollection is a relatively infrequently used construct, but is essential to how we represent journeys and hRoutes. There is some more explanation on the GitHub wiki.
Figure 1. Generic GeoJSON-T Feature, with “when” member in a FeatureCollection (simplified gazetteer record)
Figure 2. Route feature (featureType Journey); segments are geometries in GeometryCollection
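To make the two placements of “when” concrete, here is a minimal Python sketch that builds both shapes as plain dictionaries. It is an illustration under assumptions, not the GeoJSON-T spec: the internal structure of “when” (a "timespan" key holding a start/end pair), the placement of "featureType" under "properties", the function names, and all sample values are hypothetical. The GitHub wiki remains the reference.

```python
import json


def place_feature(place_id, toponym, lng, lat, when):
    """Gazetteer-style Feature: "when" sits at the same level as "geometry"."""
    return {
        "type": "Feature",
        "id": place_id,
        "properties": {"toponym": toponym},
        "geometry": {"type": "Point", "coordinates": [lng, lat]},
        "when": when,  # assumed structure, see the note above
    }


def journey_feature(route_id, segments):
    """Route Feature: every segment is one geometry in a GeometryCollection,
    and each geometry carries its own "when"."""
    geometries = [
        {"type": "LineString", "coordinates": seg["coordinates"], "when": seg["when"]}
        for seg in segments
    ]
    return {
        "type": "Feature",
        "id": route_id,
        "properties": {"featureType": "Journey"},  # placement is an assumption
        "geometry": {"type": "GeometryCollection", "geometries": geometries},
    }


# Hypothetical identifiers, coordinates, and dates throughout.
place = place_feature("p1", "Place A", 10.0, 45.0, {"timespan": ["1500", "1600"]})
journey = journey_feature("r1", [
    {"coordinates": [[10.0, 45.0], [11.0, 45.5]],
     "when": {"timespan": ["1500-03-01", "1500-03-04"]}},
])
print(json.dumps(journey, indent=2))
```

In the route case the temporal information lives on each segment geometry rather than on the Feature, which is what lets a single journey carry a different “when” for every leg.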
I’ve made the assumption that a large proportion of historical route data will be developed in spreadsheet or CSV format natively. Attributes and coding terminology will of course be distinct for every project that develops data. There’s nothing to stop anyone from creating GeoJSON-T route data from scratch, by whatever means, but if a researcher can rearrange their CSV data in a standard form, it can be converted and ingested automatically for use in the existing demo or future GeoJSON-T compatible applications.
At present, one needs to create two CSV files, one for places and one for route segments. The core fields, which are required but can in some cases have null values, are:
Places: ['collection', 'place_id', 'toponym', 'gazetteer_uri', 'gazetteer_label', 'lng', 'lat']
Route segments: ['collection', 'route_id', 'segment_id', 'source', 'target', 'label', 'geometry', 'timespan', 'duration', 'follows']
Following these, data files can have any number of further attributes/columns, which will appear in various ways within any given app. A complete accounting of these fields, and further details about data preparation and the Python conversion/ingestion scripts (csvToGeoJSON-T.py and elastic.py) will appear on the GitHub repository wiki soon. If you are anxious to play with this stuff before then (or afterwards), get in touch with me directly.
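Until that documentation appears, the general idea of the conversion can be sketched as follows. This is not csvToGeoJSON-T.py, only a rough illustration using the core segment fields listed above. It assumes the 'geometry' column holds a GeoJSON geometry serialized as JSON and that 'timespan' and 'duration' can be passed through as raw strings; the function name, the per-geometry "properties" member, and the skipping of rows without a geometry are all choices made for this sketch.

```python
import csv
import io
import json
from collections import OrderedDict

SEGMENT_FIELDS = ["collection", "route_id", "segment_id", "source", "target",
                  "label", "geometry", "timespan", "duration", "follows"]


def segments_to_features(csv_text):
    """Group segment rows by route_id into one Feature per route."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [f for f in SEGMENT_FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing core columns: " + ", ".join(missing))

    routes = OrderedDict()
    for row in reader:
        if not row["geometry"]:
            continue  # sketch only: rows with a null geometry are skipped
        geometry = json.loads(row["geometry"])  # assumed: GeoJSON stored as JSON text
        # Empty CSV cells become None (null) rather than empty strings.
        geometry["when"] = {"timespan": row["timespan"] or None,
                            "duration": row["duration"] or None}
        geometry["properties"] = {
            key: (row[key] or None)
            for key in ("segment_id", "source", "target", "label", "follows")
        }
        feature = routes.setdefault(row["route_id"], {
            "type": "Feature",
            "id": row["route_id"],
            "properties": {"collection": row["collection"]},
            "geometry": {"type": "GeometryCollection", "geometries": []},
        })
        feature["geometry"]["geometries"].append(geometry)

    return {"type": "FeatureCollection", "features": list(routes.values())}
```

Any further columns in the file are simply ignored here; a fuller converter would presumably carry them into the output as additional attributes.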
Linked Places Demo App
The GeoJSON-T format and its implementation for route data allow for some interesting display and analysis possibilities. The app so far only explores the visualization side. I’m planning to follow up this work with at least two “real” applications that do more: one for data exploration and discovery across a large distributed corpus/repository, and a second that allows manipulation and analysis of a given network of geographic movement (e.g. commodity flows like Incanto Trade, or route systems like the Ming Courier Routes). I’ve identified a few other exemplar datasets and welcome inquiries for collaboration.
Load one or more datasets; view linked gazetteer records for places; events or optionally “fuzzy” periods rendered on timeline
Search for a place; identify all members of its “conflation_of” set and all route segments associated with it, from multiple datasets
Rudimentary timeline visualization (Simile Timeline); timeline and map features are linked
Load places and segments for flows and hRoute systems (nodes and links/edges) into D3 force-directed graph; download GeoJSON-T
View linked Period gazetteer data (from Perio.do)
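For the force-directed graph view listed above, places and segments have to be reshaped into nodes and links. The following Python sketch shows one way that reshaping might look, using the core CSV field names. It is not the demo app's code: the function name is hypothetical, and it assumes that a segment's 'source' and 'target' hold 'place_id' values.

```python
import json


def to_graph(place_rows, segment_rows):
    """Build {"nodes": [...], "links": [...]} from place and segment rows."""
    nodes = {}
    for row in place_rows:
        nodes[row["place_id"]] = {"id": row["place_id"], "label": row["toponym"]}

    links = []
    for row in segment_rows:
        # Keep only segments whose endpoints are both known places.
        if row["source"] in nodes and row["target"] in nodes:
            links.append({"source": row["source"],
                          "target": row["target"],
                          "label": row["label"]})

    return {"nodes": list(nodes.values()), "links": links}


# Hypothetical rows, reduced to the fields this sketch reads.
places = [
    {"place_id": "p1", "toponym": "Place A"},
    {"place_id": "p2", "toponym": "Place B"},
]
segments = [
    {"source": "p1", "target": "p2", "label": "A to B"},
    {"source": "p2", "target": "p9", "label": "B to unknown"},
]
graph = to_graph(places, segments)
print(json.dumps(graph, indent=2))
```

Dropping links with an unknown endpoint is a deliberate simplification here; a graph layout generally needs every link to resolve to an existing node.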
The results of this work are all provisional and experimental: a conceptual model for routes (journeys, flows and historical routes/route systems), the GeoJSON-T extension, its implementation for route data and reliance on CSV input, and last but not least the map/timeline mashup. The models have been tweaked (‘refined’) as requirements come to light, and that should continue for at least a little while longer. I welcome comments here, on Twitter (@kgeographer), via the project GitHub repo, or by email: karl[dot]geog[at]gmail[dot]com.