Linking Linked Places: Modeling Historical Movement
I’m very pleased to have been awarded, with colleagues Lex Berman (Harvard Center for Geographic Analysis) and Rainer Simon (Pelagios; Austrian Institute of Technology), a Pelagios Commons Resource Development Grant supporting a project we’ve titled Linking Linked Places. The following briefly describes the motivation for this work and the concrete products we plan to present at the Linked Pasts workshop in Madrid this December.
Many of the places recorded in the growing number of Pelagios-compatible historical gazetteers have been stops on journeys and trade routes, as well as sources and sinks for flows of commodities, migrants, and information. Just as users of Peripleo can now discover items associated with a place that are catalogued in multiple linked datasets, it would be worthwhile to enable discovery of attested journeys, named historical routes, and flows for which that place was a waypoint.
Researchers studying historical movement across the landscape require a robust and flexible framework for representing such phenomena, a comparatively simple interconnection format for sharing core elements of their data, and tools to assist with data creation and transformation. Given these, larger regional and global network models can be assembled, and many world-historical studies that are now difficult or impossible would be greatly facilitated.
Over the next 4 ½ months we will be developing two such network data formats and a simple experimental interface to facilitate the creation and browsing of data for attested journeys, routes and flows. Our intent is that they complement, and be fully compatible with, Pelagios project technologies. Development will both precede and follow a 3-day working meeting at the Center for Geographic Analysis at Harvard University (September 19-21).
The historical gazetteers now linked by Pelagios do a fine job of representing places. A natural next step is to develop one or more simple, flexible models and formats for representing some of the relationships between places, enabling us to ask, for a given place, what other places have been connected to it throughout history, and in what way. This type of data is different in kind from place data in that it largely concerns occurrences: individual movement events and their aggregation as flows.
Data about such movement activity, as well as the associated ways, that is, the physical channels for that movement (roads, rivers, canals, footpaths, railroads), are usefully abstracted in the conceptual model of networks and their component nodes and edges. Nodes correspond to places, and edges to any sort of link between them.
Nodes can refer to places of any type or size: from archaeological sites to settlements, to regions and administrative areas like provinces and countries. The locations of places found in gazetteers are usually described by point, line, or polygon geometries, but they can also be unknown or estimated. Gazetteers handle places.
Edges can refer either to movement events and flows between places, or to the physical channels for that activity. The figure below illustrates a few categories of node-edge network data; the accompanying table shows how edge and node tables together can encapsulate information about journeys, routes, flows, and ways. Note that the geometry of edges, just as with nodes, might be known or unknown, estimated or ignored.
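To make the node and edge tables concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the field names, the `gaz:` identifiers, the placeholder labels and coordinates, and the `connected_places` helper are hypothetical, not part of any Pelagios format. It shows nodes that point to gazetteer records, edges typed as journey, route, flow, or way, and geometry that may simply be absent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    id: str                            # hypothetical gazetteer identifier
    label: str
    geometry: Optional[tuple] = None   # (lon, lat), or None when unknown


@dataclass(frozen=True)
class Edge:
    source: str                        # id of the node where the edge starts
    target: str                        # id of the node where the edge ends
    kind: str                          # "journey", "route", "flow", or "way"
    geometry: Optional[list] = None    # list of (lon, lat), or None when unknown


# Placeholder data, invented purely for illustration.
NODES = [
    Node("gaz:1", "Place A", (10.0, 45.0)),
    Node("gaz:2", "Place B", (12.5, 41.9)),
    Node("gaz:3", "Place C"),          # location unknown
]

EDGES = [
    Edge("gaz:1", "gaz:2", "journey"),
    Edge("gaz:2", "gaz:3", "flow"),
    Edge("gaz:1", "gaz:2", "way", [(10.0, 45.0), (12.5, 41.9)]),
]


def connected_places(place_id, edges=EDGES):
    """Map each place linked to place_id onto the sorted kinds of edge linking them."""
    result = {}
    for edge in edges:
        if edge.source == place_id:
            other = edge.target
        elif edge.target == place_id:
            other = edge.source
        else:
            continue
        result.setdefault(other, set()).add(edge.kind)
    return {other: sorted(kinds) for other, kinds in result.items()}
```

Calling `connected_places("gaz:2")` answers, for this toy data, the question of which other places have been connected to a given place, and in what way. The sketch treats edges as undirected for lookup; a real model would likely need direction for journeys and flows.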
So if virtually all data for historical movement activity and the ways it traversed can be represented as collections of nodes and edges, and node identifiers can simply point to gazetteer records for the corresponding places, what is lacking? The answers should follow from use cases for the operations we might wish to perform with the data: link it, conflate it, browse it, visualize it, analyze it. We need a few standard formats for representing the data (apart from edge and node tables), and software for performing those operations that can interpret our standard formats.
GeoJSON, the flexible and widely supported data format standard for human-readable representations of geographic features, makes an excellent starting point. However, journeys, trade routes, and flows are not so much geographic features as they are geographic phenomena: they are essentially temporal. Topotime is a project I’ve been working on for a couple of years aimed at extending GeoJSON by adding a temporal “when” element to it, one capable of representing some aspects of uncertainty. I’m calling this GeoJSON-T. Early experiments with several datasets indicate that, with some refinements, it may serve as a useful, highly flexible standard for journeys, historical routes, and flows. Under this grant we will further test that assumption.
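As a rough illustration of the idea, the Python sketch below builds an ordinary GeoJSON Feature and attaches a "when" member to it. The internal structure of "when" shown here (start and end, each with earliest and latest years) is purely an assumption made for this example, as are the placeholder coordinates, the label, and the two helper functions; it is not the GeoJSON-T specification, only one way uncertain temporal bounds might be expressed.

```python
import json

feature = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[10.0, 45.0], [12.5, 41.9]],   # placeholder coordinates
    },
    "properties": {"label": "Example journey"},
    # Hypothetical temporal member: each bound is itself an uncertain range of years.
    "when": {
        "start": {"earliest": 1200, "latest": 1210},
        "end": {"earliest": 1215, "latest": 1220},
    },
}


def possibly_active(feat, year):
    """True if the year falls anywhere within the widest possible span."""
    when = feat["when"]
    return when["start"]["earliest"] <= year <= when["end"]["latest"]


def certainly_active(feat, year):
    """True if the year falls within the span that holds under every reading."""
    when = feat["when"]
    return when["start"]["latest"] <= year <= when["end"]["earliest"]


# The extended feature is still plain JSON and survives a round trip.
serialized = json.dumps(feature)
restored = json.loads(serialized)
```

Because the extra member sits alongside the standard ones, software that only understands GeoJSON can still read the geometry and properties, while time-aware software can distinguish years in which the journey was possibly under way from years in which it certainly was.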
Linked Places Interconnection Format (LPIF)
Whereas GeoJSON-T will (hopefully) serve many projects having elaborate data modeling requirements, in order to facilitate widespread sharing and linking of historical movement data we need a simpler representation of core entities and relations, functionally comparable to the Pelagios interconnection format used for linking gazetteers. As noted, the simplest abstraction common to all such data is that of segments composed of nodes and edges. We know the edges should have temporal attributes, but what else? Under this grant, Rainer, Lex and I will develop a candidate format (with a small prototype implementation) for consideration by the Pelagios community.
 This domain involves many related concepts and terms with overlapping meanings. In the O.E.D., a way refers to “A track prepared or available for travelling along; a road, a lane, a path.” It is also the preferred term used by OpenStreetMap for any kind of network segment.
 It is worth noting that some essentially temporal constructs, such as historical periods, have equally important spatial attributes. It is likely that GeoJSON-T can be effectively used for those data as well.