Adding GeoNames to Wikidata for reconciliation

In the upcoming Version 3 beta of World Historical Gazetteer (early June 2024), we have added about 10 million GeoNames place records to the 3.6 million Wikidata records in the index we have been using for reconciliation. This means that for geocoding purposes (one of the main reasons for using the WHG reconciliation service) there will be a higher likelihood of finding prospective matches for your records.

It also adds some complexity to the review of hits (see the screen below) and we are looking for feedback on how this will work. During the beta phase we can refine or even discard this feature – up to our users!

So…how it will work:

  • When you create a new reconciliation task you have the option to exclude GeoNames records; if you do, they will be skipped in the search for matches
  • If you don’t exclude them, hits from GeoNames will be returned along with those from Wikidata, but…
    • If there are both Wikidata and GeoNames hits, the GeoNames ones will be hidden initially, but displayed on click of a toggle button
    • If there are no Wikidata hits but there are GeoNames hits, those will display right away.
  • As usual, you can select zero or more of these hits as close matches, press Save and move on to the next.
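
The display rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this post, not the WHG code itself; the function name, its parameters, and the list-of-hits representation are all assumptions made for the example.

```python
def initial_display(wikidata_hits, geonames_hits, exclude_geonames=False):
    """Return (shown, hidden) lists of hits for one record under review.

    Hypothetical helper: names and data shapes are illustrative only.
    """
    if exclude_geonames:
        # GeoNames records are skipped entirely in the search for matches
        return list(wikidata_hits), []
    if wikidata_hits and geonames_hits:
        # Both sources matched: GeoNames hits wait behind the toggle button
        return list(wikidata_hits), list(geonames_hits)
    # Only one source (or neither) matched: show whatever there is right away
    return list(wikidata_hits) + list(geonames_hits), []


shown, hidden = initial_display(["wd:Q1"], ["gn:100", "gn:200"])
print("shown:", shown, "hidden until toggled:", hidden)
```

Treating the toggle as a simple shown/hidden split keeps the reviewer's first view short whenever Wikidata already offers a candidate.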

Below you can see the review screen before and after choosing to display GeoNames hits.

Wikidata hit shown, GeoNames hidden until requested
GeoNames hits displayed on request

Version 3 due in June!

We have been busy, with both software and content development. Version 3 of the World Historical Gazetteer has been in development since February 2023, and a beta version will be available mid-2024. What follows is a brief outline of what we have been working on, much of which came as suggestions from our user community. Details will follow in the coming weeks and months, on this blog and on Twitter. We do expect to establish a Mastodon account soon as well.

Version 3 (alpha) home page

New “Gazetteer Builder” feature

  • Link multiple datasets in a single collection, e.g. for a group or individual to assemble a “Historical Gazetteer of {x}”
  • Merge multiple datasets into new dataset

Home page

  • A map(!), with search and advanced search
  • ‘Carousels’ of published datasets and collections, with extents previewed on the map
  • Improved explanation of what the WHG offers
  • News and announcements

Maps

  • All 14 maps on the site significantly upgraded
  • Most maps now have temporal controls: a timespan ‘slider’ and/or a sequence ‘player’
  • Faster display of large datasets and collections, thanks to WHG’s own new “tileboss” server

Search

  • Search now runs across all published records; the confusing “search the index or database” choice is gone!
  • Options for “starts with”, “contains”, “similar to” (aka fuzzy) as well as “exact”
  • Spatial filter on search results
  • More information returned in search result items

Place Portal pages

  • Complete makeover of its design
  • Physical geographic context: ecoregions, watersheds, rivers, boundaries
  • Nearby places
  • Preview of annotated collections that include the place

Publication and editorial workflow

  • We are now especially highlighting three types of publications: Datasets, annotated Place Collections, and Dataset Collections
  • Expanded Managing Editor role
  • Improved tracking of contributors and data, from ‘interested’ to full accessioning
  • DOIs for data publications, enhanced metadata, significantly enhanced presentation pages
  • Improved download options

Annotated place collections for teaching

  • Support for class and workshop group scenarios
  • Optional image per annotation
  • Order places sequentially with or without dates
  • Enhanced display and temporal control options
  • Optional gallery per class
  • Site-wide student gallery

“My Data” dashboard and profile

  • Single page, simpler

Study Areas

  • Discontiguous areas, e.g. Iberian peninsula and S. America as a single area

API and data dumps

  • More endpoints, better documented
  • Regular dumps of published data in multiple formats

Codebase

  • Improved file upload validation and error reporting
  • The codebase is now “dockerized,” making it much easier to contribute to the platform’s development
  • Upgraded versions of all major components: Django, PostgreSQL, Elasticsearch, etc.
  • All map-related functions refactored for efficiency

Connecting Places with World Historical Gazetteer

On 13 September, I gave an invited talk, titled “Connecting Places with World Historical Gazetteer” at the Royal Dutch Academies Humanities Cluster offices in Amsterdam (KNAW-HuC). The slides are provided here, with some annotation.

Linked Pasts VII activity: Reconciliation of Historical Place Names

We are pleased to be co-sponsoring with several colleagues a workshop/activity at the upcoming Linked Pasts VII Symposium. The activity will take place in two two-hour sessions on Wednesday and Thursday, the 15th and 16th of December (both starting at 16:00 CET). If you would like to participate, and/or be kept informed of details as the dates approach, please register your interest in this online form so we can be in touch with you.

Conveners: Tomasz Panecki, Bogumil Szady, Grzegorz Myrda (Polish HGIS); Karl Grossner, Ruth Mostern (World Historical Gazetteer)

Discussants: Ruth Mostern, Sinai Rusinek, Humphrey Southall; Merve Tekgürler

Reconciliation in the Linked Pasts context is the task of aligning records concerning named historical entities contained in one dataset with those of another, typically for places or people. Often it is a research dataset being reconciled against some authoritative resource, but sometimes aligning “peer” datasets is the goal. We perform reconciliation in order to augment our dataset with attributes gained from another (e.g. geographic coordinates and concordance identifiers in the case of places), and to link our records (and by extension our research) with that of colleagues concerned with the same places and people.

The conveners of this activity have identified three particular issues we are interested to focus on for place data, in discussion and in a related exercise. The first is the quality of reference datasets (e.g. Wikidata/DBpedia, TGN, etc.) and how their granularity influences the process of alignment. The second issue is the consistency of matching decisions between multiple reviewers. Although algorithms may help users to match place names with their modern counterparts, choices are often ambiguous. The third issue is the sustainability of results and whether such linked data products can be treated as reference works, which is associated with the problem of how the reconciliation of place names relates to the identification of geographic phenomena over time.

The activity will begin with 10-minute presentations from four discussants who have various perspectives on reconciliation from their own work. Their geographic areas of concern include historical Polish territories, the broad Middle East, Russia, and the UK. Following that, in a demo/exercise participants will jointly perform a reconciliation exercise on a portion of a 100-row example dataset assembled for this purpose, using the World Historical Gazetteer platform. Afterwards, free discussion will close out the first session. Participants will be asked to upload the sample data file into their own private space in WHG in between sessions, and perform the reconciliation task on all 100 records (this should take 30 minutes or less).

In the second session, we’ll review those records that had the most disagreement between reviewers in matching results, then have extended discussion amongst all participants about reconciliation in general and possible next steps. We anticipate the experience will foster interesting and useful discussion that will inform a best practices white paper to be co-authored by the conveners following Linked Pasts 7.

Version 2 is here!

We are pleased to announce the release of Version 2 of World Historical Gazetteer! New features have been added, and we’ve made several significant improvements to usability.  This work was made possible by the continued support of our home institution, the World History Center at the University of Pittsburgh, and especially by the collaboration and support of the Humanities Cluster of KNAW.

In recent months we asked several contributors to pause their data preparation in the WHG system while these improvements were made. We can finally “re-open the doors” so to speak,  so we invite those efforts to resume, and again encourage new contributions and collaborations. We will respond quickly to any bug reports or general inquiries about using the platform.

Over the next several months we will be adding quite a bit more data that is already in the queue. Although much user interaction with the WHG platform is self-guided and semi-automated, we have found that contributions move most smoothly with staff support. WHG staff stand ready to help with data conversion strategies and with the planning of contributions generally. Please do get in touch with us (whg at pitt dot edu) or with individual WHG project team members.

What’s New

The Site Guide and several tutorials on the WHG site describe its features and their use in some detail. The following briefly summarizes what is new since Version 1.

Collections

Registered users can now create “collections,” linking sets of existing public datasets within the system for purposes of presentation and combined search. This new feature aims at supporting the development of “focus regions” within WHG by collaborative groups with overlapping region/period interests.

Revamped search

Previously, search capability was limited to records fully accessioned into the WHG “union index,” and returned sets of one or more “closely matched” attestations of a place. This kept from view public datasets that had not yet been indexed. An option to search all public data within the WHG database—indexed or not—has been added to give a more complete view of the data we hold.

Reconciliation review

We have adopted the term “linking” to refer to all tasks of reconciliation and alignment, whether to the external sources held in our system (Wikidata, Getty TGN) or to our own WHG union index. All of these require a “Review” step, where the prospective matches discovered in the task are presented for closeMatch/no match/defer decisions. The progress of this process, which can sometimes extend over time and involve multiple people, is now tracked in the Dataset Browse screen available to the dataset contributor (“owner”) and designated collaborators. The choice to “defer” is also new since v1.2; it permits maintaining a separate queue of records, allowing users to move quickly through the easier decisions and set aside those requiring more attention, or review by others.

Views and downloads of public data

We now provide summary descriptions and mapped browsing for all datasets, collections, and individual place records that have been flagged as public. Public datasets can now be downloaded, according to CC-BY-4.0 license terms.

Faster maps

We have implemented the MapLibreGL technology for our Dataset and Collection maps, dramatically enhancing the speed of rendering large numbers of features.

Local Wikidata index

Since Version 1.2 in May, we have maintained a local index of about 3.6 million Wikidata place records, making reconciliation tasks for that resource 3x faster than the earlier SPARQL queries over the web–processing about 150-180 records per minute.

More reliable upload validation

Accounting for every possible anomaly or error in upload files is tricky. We have significantly improved the validation algorithm, trapping more errors with more user-friendly responses.

Miscellaneous

Site documentation has been edited and extended, and a number of display problems were fixed. SSL protocol (https) has been implemented for secure transfer.

 

Version 1 Has Launched

After three years of development, we are pleased to announce the launch of Version 1 of the World Historical Gazetteer (WHG), at whgazetteer.org. Version 1 follows six beta releases over the past year or so. The WHG presently indexes 1.8 million modern place references and approximately 60,000 temporally scoped records.

In addition to filtered search and API access to data, we have developed a suite of tools that allow you to upload place datasets into a private workspace, augment them by reconciling them against the Getty Thesaurus of Geographic Names and Wikidata, publish them as Linked Open Data, and contribute them for accessioning to the WHG index.

We  have a long list of planned improvements and a queue of in-progress data contributions. More data is very welcome of course, and your feedback is essential! You can use the site contact form, create an issue on GitHub, or simply write to us at whg@pitt.edu.

We have completed a Site Guide that describes the purposes, functionality, and data of the present system. The Tutorials and About sections of the site provide additional information. We will continue keeping our nearly 500 Twitter followers current with news of new data, new features, and bug fixes. We also plan to keep our blog updated with relevant announcements and discussion of the project’s future course.

We are pleased to announce this major step in our project and we look forward to your

The World Historical Gazetteer Team
Ruth Mostern, Karl Grossner, and Susan Grunewald

Linked [Places, Traces, Art, …]

This post aims to clarify the relationships between a few of the models now in development for various uses by members of the historical linked data community, particularly with regard to geography (place)—namely Linked Places, Linked Traces, and Linked Art. Figure 1 provides a conceptual overview (click to magnify).

Figure 1 – Place in Linked Places, Linked Art, and Linked Traces conceptual models

Background

In several key respects the World Historical Gazetteer project (WHG; now in beta release 0.3) builds upon software and data development work produced by the Pelagios project—particularly the historical gazetteer infrastructure underlying its Peripleo and Recogito software applications.

Peripleo is a pilot application (no longer in active development) built to demonstrate a few key linked-data-for-history functions: a) search of a central index aggregating historical gazetteer records published as Linked Data, b) the annotation of web-published records about historical objects with identifiers for relevant places (mostly coins and inscriptions in this case), and c) the display of search results for both in a map interface. WHG performs those functions also, along with some others.

Recogito is an annotation platform that among other things makes use of that historical gazetteer index by facilitating association of place references tagged in textual sources with the identifiers, coordinates, and name variants found in the indexed gazetteer records.

I have collaborated with Pelagios developer Rainer Simon and a few other interested folks to develop a Linked Places model and format particularly for contributions to the Pelagios and WHG platforms. The Pelagios and WHG indexes will have considerable overlap in coverage, but we anticipate that of WHG will over time become considerably broader in space and time—due primarily to its built-in semi-automated data development and contribution pipeline and stated goal of global breadth.

Because both projects have interest in annotations, we have also begun jointly developing a Linked Traces format—more precisely a set of implementation patterns using the W3C Web Annotation format standard for digital history and GLAM applications.

With that introduction, what follows are some details about Linked Places and Linked Traces, and thoughts about their immediate and potential uses. Also, given the concurrent development of the Linked Art model and ontology, some thoughts about how all of these might in time relate to each other in practice. Figure 1 above should provide useful reference.

Linked Places: model and data format

The Linked Places model and interconnection format (LPF) were developed to meet the particular requirements of the WHG and Pelagios platforms: a common data structure that both could ingest routinely without the need to accommodate on a case-by-case basis the enormous variety of data models in use by digital historical projects, large and small. LPF is a set of extensions to GeoJSON-LD, itself a Linked Data enabling extension to the most widely implemented text-based format for representing geographic features, GeoJSON (an IETF standard).

LPF also adds a standard means for adding time to GeoJSON features, introducing “when” objects to permit temporal scoping of a) an entire Feature, and/or b) its individual names, place types, geometry, and relations to other places, in any combination.
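
To make the idea of “when” objects concrete, here is a minimal Python sketch of a GeoJSON Feature scoped in time both as a whole and per name. The key names used inside the “when” objects (`timespans`, `start`, `end`, `in`) and the `names`/`toponym` keys are assumptions for illustration; the Linked Places specification is the authority on the actual structure. The place, names, and dates are invented.

```python
import json


def when(start, end):
    """Build an illustrative "when" object spanning start..end (year strings)."""
    return {"timespans": [{"start": {"in": start}, "end": {"in": end}}]}


feature = {
    "type": "Feature",
    "properties": {"title": "Example Town"},
    "when": when("1200", "1800"),  # (a) scopes the entire Feature
    "names": [
        # (b) each name carries its own temporal scope
        {"toponym": "Old Name", "when": when("1200", "1500")},
        {"toponym": "New Name", "when": when("1500", "1800")},
    ],
    "geometry": {"type": "Point", "coordinates": [10.0, 50.0]},
}


def names_in_year(feat, year):
    """Return the toponyms whose "when" covers the given year."""
    found = []
    for name in feat["names"]:
        for span in name["when"]["timespans"]:
            if int(span["start"]["in"]) <= year <= int(span["end"]["in"]):
                found.append(name["toponym"])
                break
    return found


print(json.dumps(feature["when"]))
print(names_in_year(feature, 1300))
```

The same pattern would apply to place types, geometry, and relations, each of which may carry its own “when”.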

Uploads to WHG (and accessioning to both Pelagios and WHG indexes) require creating a serialization (i.e. transforming export) of place data from whatever form it is maintained in to LPF. We have also developed an abbreviated delimited text file format (LP-TSV) to meet the needs of contributors with relatively simple records.

Figure 1 summarizes the LPF conceptual model.

Linked Art: model and format

A global consortium of organizations involved in the domains of art, cultural heritage and archaeology—principally large museums and universities—are jointly developing Linked Art, a “shared model based on Linked Open Data to describe Art,” along with software implementations of it. The conceptual model is being formalized in an ontology with a subset of CIDOC-CRM entities and relations, and expressed as a data format using JSON-LD, a syntax of RDF.

From the perspective of WHG, Linked Art is a format many prospective users of our platform may adopt to describe objects in their collections. Both WHG and Pelagios are agnostic as to what formats our users and data partners use, and as mentioned above, users will have to perform a serialization to Linked Places format to interact with our platforms.

Figure 1 shows how Place appears in the Linked Art model. The points of contact with Linked Places are identifiers. One kind of identifier in Linked Art is a URI to a linked data gazetteer resource. A serialization of Places from a Linked Art dataset to LPF should include as many such identifiers as can be managed. WHG can aid discovery of those URIs via reconciliation services to Getty TGN and Wikidata.

All that said, place data from Linked Art collections are unlikely to be good candidates for contributions to WHG; the great majority of places will already be indexed. Rather, it is Linked Traces data that will be more relevant.

Linked Traces: model and format

WHG is following on from the Peripleo pilot in experimentally indexing not only place data, but what we are calling trace data: “annotations of web-published records about historical objects with identifiers for relevant places.” We say “experimentally” because it seems likely that the most useful web interfaces to trace data will be distinct from those for place data. Certainly there will be significant scaling issues.

In order to continue exploring the linking of places and associated traces, we (Rainer Simon and I) have also initiated development of a Linked Traces format, as a potential standard for use by the WHG and Recogito platforms. Linked Traces is turning out to be a set of implementation patterns for the W3C Web Annotation format (WA).

Annotating records of “anything” with URIs for web-published place records is but one use case for WA. For example, in Recogito, users annotate texts with references to not only places, but also people, events, and relations between all three.

Figure 1 indicates the way that a set of one or more place records can form the body of an annotation. The JSON form of the body in that example corresponds to an early draft of a “Linked Traces place pattern” in development. The working group’s activities are paused at the moment, but WHG is developing some exemplar data according to that draft, to be explored in our Version 1 release, slated for late spring 2020.

Beta Release v0.2

At long last we are ready to offer a v0.2 beta release of the World Historical Gazetteer (WHG) at http://dev.whgazetteer.org. We hope that spatial historians and spatio-temporal infrastructure developers will be interested in taking a look at what we are building, experimenting with their data or provided samples. It is a “sandbox,” so nothing will be saved for the time being (that will change soon). There are 5-6 months remaining in the term of our initial NEH grant, time enough to complete most of what we planned for this phase, and to incorporate more suggestions from users and potential contributors as we move toward future planning and development.

The site includes a brief guide titled “WHG Beta Release: A Tour,” which outlines what is there, what you can do and how, remaining challenges, and what is in the works. What follows is a higher level introduction.

Places and Traces

The World Historical Gazetteer is a Linked Open Data platform for publishing, linking, discovering, and visualizing contributed records of attested historical places and traces. Our initial focus has been on places, but we are working experimentally to demonstrate their integration within the platform with what we now call traces–defined as web resources about historical entities for which location in time and space is of scholarly and general interest. We are considering three classes of traces for the time being: agents (people and groups), works (e.g. artifacts, texts, datasets), and events (e.g. journeys, conflict). Our objective has been to create the first large-scale spatial infrastructure for world history: oriented toward documenting the human past at the global scale, and particularly the geography of global and transregional connections.

Our accessioning process is intended to eventually be largely self-directed; getting it to that stage means working directly and hands-on with our early contributors.

LOD Publication

Registered users of WHG can publish their place records as Linked Open Data simply by uploading them in Linked Places format (or the LP-TSV version intended for relatively simpler records). We see LOD publication as a key feature for researchers who are not in a position to stand up their own web interfaces with per-place pages. Once uploaded, each record will have a permanent URI and be accessible in our graphical interface and API, putting it on its way to being LOD in good standing. The dataset can be browsed immediately by its owner in a searchable table and map, but turning the uploaded dataset into a contribution for accessioning requires some further steps. The data needs to have as many asserted links to name authorities as possible, and augmentations of geometry where that is missing and findable. We provide reconciliation services for that purpose.

Reconciliation

Simply put, reconciliation is the process of identifying matches between records of named entities. In this case the records are for places, and the matches are between a researcher’s records and those in existing place name authorities. So far, we provide reconciliation services for the Getty Thesaurus of Geographic Names (TGN) and Wikidata; DBpedia and GeoNames are planned.

The reconciliation process has two steps: 1) sending records to the authority, and 2) reviewing the prospective matches returned and accepting or declining them as appropriate. The results of this somewhat laborious process are 1) links, and 2) more geometry. Once augmented in this way, a dataset is ready for accessioning.

Accessioning

The last step is another reconciliation effort — this time to the WHG index. Each record is compared to the growing WHG index to determine if we have a contributed attestation for the place yet or not. If we do, the incoming record becomes a “child” or “leaf” in the set of attestations for the place. If the place is not yet accounted for, the new record becomes a “parent” — the seed for a new set of attestations. At this stage, an automatic linking can be made if two records share an authority match, but the rest will have to be reviewed as described above.
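
A rough sketch of that parent/child logic follows, in Python. It is a simplification under stated assumptions, not the WHG implementation: the index is modeled as a plain dictionary, authority matches as sets of identifier strings, and the reviewer's decision as an optional argument. All names are hypothetical.

```python
def accession(index, record, reviewer_choice=None):
    """Add a record to a toy index and return the id of its parent.

    index: dict mapping parent id -> {"links": set, "children": list}
    record: {"id": str, "links": set of authority identifiers}
    reviewer_choice: parent id picked in manual review, if any
    """
    # 1) Automatic linking: the record shares an authority match with a parent set
    for parent_id, entry in index.items():
        if entry["links"] & record["links"]:
            entry["children"].append(record["id"])
            entry["links"] |= record["links"]  # assumption: the set pools its links
            return parent_id
    # 2) Manual review matched the record to an existing parent
    if reviewer_choice in index:
        index[reviewer_choice]["children"].append(record["id"])
        index[reviewer_choice]["links"] |= record["links"]
        return reviewer_choice
    # 3) Place not yet accounted for: the record seeds a new set of attestations
    index[record["id"]] = {"links": set(record["links"]), "children": []}
    return record["id"]


index = {}
print(accession(index, {"id": "a", "links": {"tgn:1"}}))           # new parent
print(accession(index, {"id": "b", "links": {"tgn:1", "wd:Q1"}}))  # child of "a"
```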

Graphical Interface

The opening screen of WHG offers users search of places and traces. We try to offer enough context on the opening screen to identify the likeliest match. Once you identify a place of interest, clicking its name takes you to a “place portal” screen–where everything we have about the place, or linked to it in some way, will appear: attestations from contributors, associated traces, nearby places, physical geographic context (rivers, watersheds, ecoregions). The place portal is very much a work-in-progress at this stage. Several other features are also on our near-term to do list, including advanced search; more and better maps; user data collections; project team ‘workspace’; batching of reconciliation tasks; and more.

A Word About Architecture

There are two data stores within the WHG platform: a relational database (PostgreSQL) and a high-speed index (Elasticsearch). All uploaded data gets imported to a set of relational tables whose names correspond to the elements of Linked Places format: places, place_name, place_type, place_geom, place_link, place_when, place_related, place_description, and place_depiction. Contributed data is most readily managed in that form. Upon accessioning, records are added to the index in the manner described under Accessioning above.

An API

This part of the WHG platform is one of the most important, and the least developed right now. Stay tuned for further developments. Our intention is to provide access to both contributors’ individual records and datasets from the database (when designated by their owner as public), and to the aggregating index records; both with numerous and useful filtering capabilities.

Content

Our index has been instantiated with records from modern gazetteer resources: 1) about 1,000 of the world’s most populous cities from GeoNames, 2) ~1.8 million place records from Getty TGN, 3) about 1,500 societies from the D-Place anthropological repository; and 4) major rivers, lakes, and mountain ranges from Natural Earth and Wildlife Research Institute.

To this modern “core” we have begun adding historical data: 1) 10,600 entities harvested from the index of the Atlas of World History (Dorling Kindersley, 1995), offering broad but shallow global coverage; and 2) our first specialist gazetteer, HGIS de las Indias, which consists of approximately 15,000 settlements and territories in colonial Latin America. There are several additional large datasets in the queue, which we will be adding in partnership with contributors. Some are previewed as heat maps on our Maps page.

Broad coverage of modern names with increasing historical depth and connections supplied by trace data.

Our Pelagios Connections

The WHG platform borrows extensively from the Peripleo application developed by Rainer Simon of the Pelagios project, extending it significantly in a few ways. Our backend architecture closely mimics that underlying both Peripleo and the Recogito annotation tool, and we are actively collaborating with Rainer and the entire Pelagios Network team on several aspects of this work. In particular, we are co-developing the data format standards for contributions to both systems: Linked Places format, and a nascent Linked Traces annotation format.

Feedback

We welcome suggestions, critiques, even praise :^) – and there is an email form on the site which makes it easy to offer it. Please bear with us in this active development stage and check back as we realize the system’s potential more fully over the next several months. Look for further blog posts and follow us on Twitter; we tweet progress and related information as @WHGazetteer and @kgeographer.

 

Linked Traces progress

As described in our last post, the World-Historical Gazetteer (WHG) and Pelagios projects have adopted the term “trace” to refer to historical entities for which there is spatial-temporal data of interest, including events, people, works, and other artifacts. Following the lead of Pelagios’ Peripleo, the WHG system (initial beta release July 2019) will index contributed trace data, linking them to places in an underlying knowledge graph that is a) navigable in graphical features, and b) queryable in an API.

We (WHG and Pelagios) have set out to create a standard Linked Traces data format (LTAnno), which will take the form of W3C web annotations. We welcome (need, actually) active collaboration in that modeling task, and feedback from interested observers.

An LTAnno target is an LOD web-published record of some entity, and its body contains a) a place record URI, b) a relation between the place and the target entity, and c) an optional temporal scoping for each. It should be possible to have multiple bodies per target (per example below) and multiple targets per body (e.g several people having the same birthplace, several works having the same place as subject, and so on).
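
As a hedged illustration of that shape, the Python sketch below assembles one annotation with several bodies for a single target. Only `@context`, `type`, `target`, and `body` come from the W3C Web Annotation model; the body keys `place_uri`, `relation`, and `when`, the helper name, and the example URIs and dates are placeholders invented here, since the LTAnno spec is still a draft.

```python
import json


def make_trace_annotation(target_uri, places):
    """Build one annotation whose bodies each point at a place record.

    places: list of (place_uri, relation, when) tuples; `when` may be None.
    """
    bodies = []
    for place_uri, relation, when in places:
        body = {"place_uri": place_uri, "relation": relation}  # hypothetical keys
        if when is not None:
            body["when"] = when  # temporal scoping is optional
        bodies.append(body)
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "target": target_uri,
        "body": bodies,
    }


anno = make_trace_annotation(
    "http://example.org/records/journey-1",
    [
        ("http://example.org/places/1", "waypoint", None),
        ("http://example.org/places/2", "waypoint", {"start": "1500", "end": "1501"}),
    ],
)
print(json.dumps(anno, indent=2))
```

The reverse case, several targets sharing one body, could be handled by letting `target` hold a list of URIs.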

A Trace Data Example

WHG will fully support LTAnno format, and likely focus on a few types of trace data, including those related to geographic movement such as journeys and cultural diffusion. Figures 1 and 2 illustrate a test Journey record using the draft LTAnno format. In it, 38 annotation bodies referring to WHG place URIs are linked to a single target, the WorldCat record for the source of the waypoints, “Xuanzang: a Buddhist pilgrim on the Silk Road” (Wriggins, 1997). A user finding their way to any of the 38 places will learn they were waypoints on the journey, be able to see the others, and to navigate to their respective place pages. This only scratches the surface of what will be possible, given a growing volume of trace data for other events, people, and works.

Figure 1 – Portal page for Bamiyan in World-Historical Gazetteer (in development)

Figure 2 – Each waypoint for the Journey trace is linked to its own place portal

Next Steps

The draft examples of LTAnno for different trace types are only preliminary, for discussion. In the coming weeks, Rainer Simon and I will coordinate development of a spec our respective projects can support. There is a Google Group and email list for this working group, and we will go into further detail about the draft spec there shortly. Active collaboration by data modelers, data providers, and future users of the format is most welcome.

One of the first orders of business is gathering a few sample datasets for different types of traces, in order to better understand the variety of modeling circumstances. These will then have to be converted into early versions of the format to test usability and usefulness.

At a later stage, we’ll have to put together a simple Linked Pasts ontology describing terms introduced by both this new LTAnno format and the recently developed Linked Places format for gazetteer data connectivity.

Linked Traces

Linked Pasts is an annual symposium. Linked.Art and Linked Places are data models with associated format specifications. Can we manage one more Linked something? Rainer Simon and I have begun an initiative to develop a Linked Traces annotation model and file format as a standard for contributions to linked open data aggregation projects such as World-Historical Gazetteer and Pelagios. The effort could easily extend to software and systems for displaying, searching, and analyzing trace data. The idea has drawn considerable interest, so here are some thoughts to start a discussion…and action.

What is a trace?

For our purposes a trace is any historical entity having a spatial-temporal setting (i.e. footprint) of interest—very general! The types of traces we’re immediately focused on include: people and groups of people, events of any complexity, and artifacts of all kinds (e.g. objects, texts, art works).

What is trace data?

Trace data are annotations of web-published records about (and images of) trace entities. We posit here that the body of a trace annotation must include a place reference (URI and name/title), should include a relation (e.g. waypoint, findspot, birthplace), and could include a temporal scope for that relation. Properties such as creator and creation date are also required. Trace data should take the form spelled out in the W3C Web Annotation Model and Vocabulary, in the JSON-LD syntax of RDF. Draft examples of some trace annotations have been posted in a GitHub repository for discussion. A few outstanding issues, outlined below, need community consensus to resolve.
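The must/should/could levels described here can be read as a simple validation rule. What follows is a minimal Python sketch, not part of any draft spec. It assumes the property names that appear in the draft samples (`created`, `dc:title`, `lpo:relation`, `lpo:when`), all of which may change, and the function name `check_trace_annotation` is purely illustrative.

```python
from typing import Any


def check_trace_annotation(anno: dict[str, Any]) -> dict[str, list[str]]:
    """Return 'errors' (missing musts) and 'warnings' (missing shoulds)."""
    errors: list[str] = []
    warnings: list[str] = []

    # Musts at the annotation level: creator and creation date.
    for key in ("creator", "created"):
        if key not in anno:
            errors.append(f"missing required property: {key}")

    bodies = anno.get("body", [])
    if isinstance(bodies, dict):  # a single body may be given as one object
        bodies = [bodies]
    if not bodies:
        errors.append("annotation has no body")

    for i, body in enumerate(bodies):
        # Must: a place reference, i.e. a URI plus a name/title.
        if not body.get("id"):
            errors.append(f"body[{i}]: missing place URI (id)")
        if not body.get("dc:title"):
            errors.append(f"body[{i}]: missing place name (dc:title)")
        # Should: a relation such as waypoint, findspot, or birthplace.
        if "lpo:relation" not in body:
            warnings.append(f"body[{i}]: no relation given")
        # Could: a temporal scope (lpo:when); its absence is not reported.

    return {"errors": errors, "warnings": warnings}
```

Separating errors from warnings mirrors the distinction between must and should, so a contribution lacking a relation could still be accepted while being flagged for review.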

Why trace data?

The Peripleo pilot application launched a few years ago by the Pelagios project is an example of traces in action. Underlying Peripleo is an index of a) place records aggregated from multiple gazetteers, and b) what we are now calling trace data: annotations of records about ancient coins, coin hoards, and inscriptions with relevant locations such as find spots.

There are many other kinds of things associated with places (at particular times or during periods) that we might like to see, compare, and analyze as elements of “deep” linked data place records in future Peripleo-like software (e.g. World-Historical Gazetteer, now in development). For a given place, one could discover not only what museum artifacts or inscriptions were found there, but also what historical persons are associated with the place, and in what way; what journeys of exploration or pilgrimage it was a waypoint on; and what texts and art works it is a subject of.

We have already heard from people with Person and Event data, and Rainer notes that this should support annotations of IIIF-formatted manuscripts and other images.

One sample

Here is one sample draft annotation record for a Journey event. As mentioned, more examples are on GitHub.

{ "@context":[
    "http://www.w3.org/ns/anno.jsonld",
    { "lpo": "http://linkedpasts.org/ontology/lpo.jsonld"}
  ],
  "id": "http://my.org/annotations/92837",
  "type": "Annotation",
  "creator": {
    "id":"http://example.org/people/2345",
    "name":"Ima Tracemaker",
    "homepage":"http://tracemaker.org"},
  "created": "2019-03-18",
  "motivation": "linking",
  "body": [
    {"id": "http://whgazetteer.org/places/86880",
     "dc:title": "Tashkent",
     "lpo:relation": "lpo:waypoint",
     "lpo:when": {"timespans":[
       {"start":{"in":"630"},"end":{"in":"630"}}]}
    },
    {"id": "http://whgazetteer.org/places/84774",
     "dc:title": "Mathura",
     "lpo:relation": "lpo:waypoint",
     "lpo:when": {"timespans":[
        {"start":{"in":"634"},"end":{"in":"634"}}]}
    },
   // ... etc.
 ],
 "target": {
   "id": "http://my.org/events/90001",
   "type": "lpo:Journey",
   "dc:title": "Pilgrimage of Xuanzang"
 }
}
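As a rough illustration of how a consumer might use such a record, the Python sketch below pulls the waypoints out of an annotation and orders them by start year. It is only a sketch under assumptions: the property names follow the draft sample and may change, the `in` value is assumed to be a plain year string, and the helper name `waypoints_in_order` is hypothetical.

```python
def waypoints_in_order(annotation):
    """Return (year, place title, place URI) tuples sorted by start year."""
    rows = []
    for body in annotation.get("body", []):
        # Only waypoint relations are of interest for a Journey trace.
        if body.get("lpo:relation") != "lpo:waypoint":
            continue
        for span in body.get("lpo:when", {}).get("timespans", []):
            # Assumption: "in" holds a plain year string such as "630".
            year = int(span["start"]["in"])
            rows.append((year, body["dc:title"], body["id"]))
    return sorted(rows)


# Two bodies taken from the sample, deliberately listed out of order.
journey = {
    "type": "Annotation",
    "body": [
        {"id": "http://whgazetteer.org/places/84774",
         "dc:title": "Mathura",
         "lpo:relation": "lpo:waypoint",
         "lpo:when": {"timespans": [
             {"start": {"in": "634"}, "end": {"in": "634"}}]}},
        {"id": "http://whgazetteer.org/places/86880",
         "dc:title": "Tashkent",
         "lpo:relation": "lpo:waypoint",
         "lpo:when": {"timespans": [
             {"start": {"in": "630"}, "end": {"in": "630"}}]}},
    ],
    "target": {"id": "http://my.org/events/90001", "type": "lpo:Journey"},
}

for year, title, uri in waypoints_in_order(journey):
    print(year, title, uri)
```

Sorting by date is only one way to recover a sequence; whether order should instead be stated explicitly in the annotation is one of the open questions.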

Open questions

The next step is for a working group to collectively answer existing open questions, and to surface (and answer) questions we haven’t thought of. We welcome collaborators and observers. A few questions that came to mind while developing the prospective samples:

  1. What are the types of traces (annotation targets)?
  2. Should there be a vocabulary of type-specific relations? E.g. waypoint for Journey traces, or birthplace for Persons.
  3. How can bodies (place/time assertions) be combined as sequences in sets for a given target? E.g. Journey waypoints.
  4. How can relations be combined in sets? E.g. a Place was both a birthplace and deathplace for a Person.
  5. Where can “when” be expressed in an annotation? E.g. in #4, can a date be associated with each relation to the Place?
  6. How should the “extension” terms we introduce (as allowed by the W3C spec) be defined? In a “Linked Pasts Ontology”? What will be its contents?

Undoubtedly more will surface.

Next steps

I’ve created a Google Group email list for tracking conversation amongst collaborators and observers, and posted parts of this document as an editable Google Doc. After some initial feedback perhaps we should have a Google Hangout. (My plan to begin extracting Google from my life is not going well!) Suggestions for other tools and platforms are welcome.