Linked Traces progress

As described in our last post, the World-Historical Gazetteer (WHG) and Pelagios projects have adopted the term “trace” to refer to historical entities for which there is spatial-temporal data of interest, including events, people, works, and other artifacts. Following the lead of Pelagios’ Peripleo, the WHG system (initial beta release July 2019) will index contributed trace data, linking them to places in an underlying knowledge graph that is a) navigable through graphical features and b) queryable through an API.

We (WHG and Pelagios) have set out to create a standard Linked Traces data format (LTAnno), which will take the form of W3C web annotations. We welcome (need, actually) active collaboration in that modeling task, and feedback from interested observers.

An LTAnno target is an LOD web-published record of some entity, and its body contains a) a place record URI, b) a relation between the place and the target entity, and c) an optional temporal scoping for each. It should be possible to have multiple bodies per target (as in the example below) and multiple targets per body (e.g., several people having the same birthplace, several works having the same place as subject, and so on).
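The "multiple targets per body" case can be sketched in Python as a plain dictionary serialized to JSON. This is only an illustration under stated assumptions: the helper name `make_shared_place_annotation`, the `example.org` URIs, and the `lpo:birthplace` term are hypothetical, and the property names simply mirror the draft sample rather than a settled spec. The W3C Web Annotation model does allow an annotation to carry more than one target.

```python
import json

def make_shared_place_annotation(anno_id, place_uri, place_title, relation, target_uris):
    """Build one annotation whose single place body applies to several targets.

    Hypothetical helper: property names follow the draft sample, not a final spec.
    """
    return {
        "@context": [
            "http://www.w3.org/ns/anno.jsonld",
            {"lpo": "http://linkedpasts.org/ontology/lpo.jsonld"},
        ],
        "id": anno_id,
        "type": "Annotation",
        "motivation": "linking",
        # One body: the place, plus its relation to every target below.
        "body": [
            {"id": place_uri, "dc:title": place_title, "lpo:relation": relation}
        ],
        # Several targets: e.g. records for people born in the same place.
        "target": [{"id": uri} for uri in target_uris],
    }

# All URIs below are placeholders for illustration only.
annotation = make_shared_place_annotation(
    "http://example.org/annotations/1",
    "http://example.org/places/1",
    "Example Town",
    "lpo:birthplace",
    ["http://example.org/people/1", "http://example.org/people/2"],
)
print(json.dumps(annotation, indent=2))
```

Whether a shared body like this is preferable to one annotation per target is a modeling choice the working group would still need to settle.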

A Trace Data Example

WHG will fully support the LTAnno format, and will likely focus on a few types of trace data, including those related to geographic movement such as journeys and cultural diffusion. Figures 1 and 2 illustrate a test Journey record using the draft LTAnno format. In it, 38 annotation bodies referring to WHG place URIs are linked to a single target, the WorldCat record for the source of the waypoints, “Xuanzang: a Buddhist pilgrim on the Silk Road” (Wriggins, 1997). A user finding their way to any of the 38 places will learn that it was a waypoint on the journey, see the others, and be able to navigate to their respective place pages. This only scratches the surface of what will be possible, given a growing volume of trace data for other events, people, and works.

Figure 1 – Portal page for Bamiyan in World-Historical Gazetteer (in development)
Figure 2 – Each waypoint for the Journey trace is linked to its own place portal
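The navigation described above (from one place to the trace it belongs to, and on to the other places in that trace) amounts to a lookup keyed by place URI. The following Python sketch shows one way such a lookup could work. It is not the WHG implementation: the function names are hypothetical, the data is a two-waypoint excerpt using the draft property names, and it assumes bodies and targets are objects with an `id`.

```python
from collections import defaultdict

def as_list(value):
    """Bodies and targets may be a single object or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

def index_by_place(annotations):
    """Map each place URI to the (target id, relation) pairs that mention it."""
    index = defaultdict(list)
    for anno in annotations:
        for body in as_list(anno.get("body")):
            for target in as_list(anno.get("target")):
                index[body["id"]].append((target["id"], body.get("lpo:relation")))
    return index

def other_places(annotations, place_uri):
    """Titles of the other places that share an annotation with place_uri."""
    titles = []
    for anno in annotations:
        bodies = as_list(anno.get("body"))
        if any(b["id"] == place_uri for b in bodies):
            titles.extend(b["dc:title"] for b in bodies if b["id"] != place_uri)
    return titles

TASHKENT = "http://whgazetteer.org/places/86880"

# Two-waypoint excerpt of the draft Journey record, for illustration.
journey = {
    "type": "Annotation",
    "body": [
        {"id": TASHKENT, "dc:title": "Tashkent", "lpo:relation": "lpo:waypoint"},
        {"id": "http://whgazetteer.org/places/84774", "dc:title": "Mathura",
         "lpo:relation": "lpo:waypoint"},
    ],
    "target": {"id": "http://my.org/events/90001", "type": "lpo:Journey"},
}

index = index_by_place([journey])
print(index[TASHKENT])                    # traces that mention Tashkent
print(other_places([journey], TASHKENT))  # the other waypoints on those traces
```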

Next Steps

The draft examples of LTAnno for different trace types are only preliminary, for discussion. In the coming weeks, Rainer Simon and I will coordinate development of a spec our respective projects can support. There is a Google Group and email list for this working group, and we will go into further detail about the draft spec there shortly. Active collaboration by data modelers, data providers, and future users of the format is most welcome.

One of the first orders of business is gathering a few sample datasets for different types of traces, in order to better understand the variety of modeling circumstances. These will then have to be converted into early versions of the format to test usability and usefulness.

At a later stage, we’ll have to put together a simple Linked Pasts ontology describing terms introduced by both this new LTAnno format and the recently developed Linked Places format for gazetteer data connectivity.

Linked Traces

Linked Pasts is an annual symposium. Linked.Art and Linked Places are data models with associated format specifications. Can we manage one more Linked something? Rainer Simon and I have begun an initiative to develop a Linked Traces annotation model and file format as a standard for contributions to linked open data aggregation projects such as World-Historical Gazetteer and Pelagios. The effort could easily extend to software and systems for displaying, searching, and analyzing trace data. The idea has drawn considerable interest, so here are some thoughts to start a discussion…and action.

What is a trace?

For our purposes a trace is any historical entity having a spatial-temporal setting (i.e. footprint) of interest—very general! The types of traces we’re immediately focused on include: people and groups of people, events of any complexity, and artifacts of all kinds (e.g. objects, texts, art works).

What is trace data?

Trace data are annotations of web-published records about (and images of) trace entities. We posit here that the body of a trace annotation must include a place reference (URI and name/title), should include a relation (e.g. waypoint, findspot, birthplace), and could include a temporal scope for that relation. Properties such as creator and date are also musts. Trace data should take the form spelled out in the W3C Web Annotation Model and Vocabulary, in the JSON-LD syntax of RDF. Draft examples of some trace annotations have been posted in a GitHub repository for discussion. There are a few outstanding issues that need community consensus to resolve, outlined below.
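The must/should/could rules above could be checked mechanically. The Python sketch below is a minimal illustration, not a validator for any published spec: the function name `check_trace_annotation` is hypothetical, and the property names (`creator`, `created`, `dc:title`, `lpo:relation`, `lpo:when`) are assumed from the draft sample in this post. Missing "must" items are reported as errors, missing "should" items as warnings, and the optional temporal scope is not checked at all.

```python
def check_trace_annotation(anno):
    """Return (errors, warnings) for one annotation, per the draft rules.

    Assumes the draft property names; these may change as the spec develops.
    """
    errors, warnings = [], []

    # Annotation-level musts.
    for key in ("creator", "created"):
        if key not in anno:
            errors.append(f"missing required property: {key}")

    bodies = anno.get("body", [])
    if isinstance(bodies, dict):
        bodies = [bodies]
    if not bodies:
        errors.append("annotation has no body")

    for i, body in enumerate(bodies):
        # Must: a place reference, i.e. URI and name/title.
        if not body.get("id"):
            errors.append(f"body {i}: missing place URI")
        if not body.get("dc:title"):
            errors.append(f"body {i}: missing place name/title")
        # Should: a relation between the place and the target.
        if not body.get("lpo:relation"):
            warnings.append(f"body {i}: no relation given")
        # Could: "lpo:when" is optional, so its absence is not reported.

    return errors, warnings

complete = {
    "creator": {"id": "http://example.org/people/2345"},
    "created": "2019-03-18",
    "body": [{"id": "http://whgazetteer.org/places/86880",
              "dc:title": "Tashkent",
              "lpo:relation": "lpo:waypoint"}],
}
incomplete = {
    "created": "2019-03-18",
    "body": [{"id": "http://whgazetteer.org/places/86880",
              "dc:title": "Tashkent"}],
}

print(check_trace_annotation(complete))
print(check_trace_annotation(incomplete))
```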

Why trace data?

The Peripleo pilot application launched a few years ago by the Pelagios project is an example of traces in action. Underlying Peripleo is an index of a) place records aggregated from multiple gazetteers, and b) what we are now calling trace data: annotations of records about ancient coins, coin hoards, and inscriptions with relevant locations such as find spots.

There are many other kinds of things associated with places, at particular times or during periods, that we might like to see, compare, and analyze as elements of “deep” linked data place records in future Peripleo-like software (e.g. World-Historical Gazetteer, now in development). For a given place, a user could discover not only what museum artifacts or inscriptions were found there, but also what historical persons are associated with the place, and in what way; what journeys of exploration or pilgrimage it was a waypoint on; and what texts and art works it is a subject of.

We have already heard from people with Person and Event data, and Rainer notes that this should support annotations of IIIF-formatted manuscripts and other images.

One sample

Here is one sample draft annotation record for a Journey event. As mentioned, more examples are on GitHub.

{ "@context":[
    "http://www.w3.org/ns/anno.jsonld",
    { "lpo": "http://linkedpasts.org/ontology/lpo.jsonld"}
  ],
  "id": "http://my.org/annotations/92837",
  "type": "Annotation",
  "creator": {
    "id":"http://example.org/people/2345",
    "name":"Ima Tracemaker",
    "homepage":"http://tracemaker.org"},
  "created": "2019-03-18",
  "motivation": "linking",
  "body": [
    {"id": "http://whgazetteer.org/places/86880",
     "dc:title": "Tashkent",
     "lpo:relation": "lpo:waypoint",
     "lpo:when": {"timespans":[
       {"start":{"in":"630"},"end":{"in":"630"}}]}
    },
    {"id": "http://whgazetteer.org/places/84774",
     "dc:title": "Mathura",
     "lpo:relation": "lpo:waypoint",
     "lpo:when": {"timespans":[
       {"start":{"in":"634"},"end":{"in":"634"}}]}
    },
    // ... etc.
  ],
  "target": {
    "id": "http://my.org/events/90001",
    "type": "lpo:Journey",
    "dc:title": "Pilgrimage of Xuanzang"
  }
}
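A consumer of a record like this might want the waypoints in chronological order. The Python sketch below reads a trimmed copy of the sample (two bodies, context and creator omitted) and sorts the waypoints by start year. It assumes the `in` values are plain year strings, as in the sample, and the helper names are hypothetical. Sorting by date is only one possible way to sequence bodies; how sequences should be expressed in the format is still an open question.

```python
import json

# Trimmed copy of the draft sample, with the bodies deliberately out of order.
SAMPLE = """
{
  "id": "http://my.org/annotations/92837",
  "type": "Annotation",
  "body": [
    {"id": "http://whgazetteer.org/places/84774",
     "dc:title": "Mathura",
     "lpo:relation": "lpo:waypoint",
     "lpo:when": {"timespans": [{"start": {"in": "634"}, "end": {"in": "634"}}]}},
    {"id": "http://whgazetteer.org/places/86880",
     "dc:title": "Tashkent",
     "lpo:relation": "lpo:waypoint",
     "lpo:when": {"timespans": [{"start": {"in": "630"}, "end": {"in": "630"}}]}}
  ],
  "target": {
    "id": "http://my.org/events/90001",
    "type": "lpo:Journey",
    "dc:title": "Pilgrimage of Xuanzang"
  }
}
"""

def start_year(body):
    """Start year of a body's first timespan (assumes a plain year string)."""
    return int(body["lpo:when"]["timespans"][0]["start"]["in"])

def ordered_waypoints(anno):
    """Return (year, place title) pairs for waypoint bodies, earliest first."""
    waypoints = [b for b in anno["body"] if b.get("lpo:relation") == "lpo:waypoint"]
    return [(start_year(b), b["dc:title"]) for b in sorted(waypoints, key=start_year)]

annotation = json.loads(SAMPLE)
for year, title in ordered_waypoints(annotation):
    print(year, title)
```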

Open questions

The next step is for a working group to collectively answer existing open questions, and to surface (and answer) questions we haven’t thought of. We welcome collaborators and observers. A few questions that came to mind while developing the prospective samples:

  1. What are the types of traces (annotation targets)?
  2. Should there be a vocabulary of type-specific relations? E.g. waypoint for Journey traces, or birthplace for Persons.
  3. How can bodies (place/time assertions) be combined as sequences in sets for a given target? E.g. Journey waypoints.
  4. How can relations be combined in sets? E.g. a Place was both a birthplace and deathplace for a Person.
  5. Where can “when” be expressed in an annotation? E.g. in #4, can a date be associated with each relation to the Place?
  6. How should the “extension” terms we introduce (as allowed by the W3C spec) be defined? In a “Linked Pasts Ontology”? What will be its contents?

Undoubtedly more will surface.

Next steps

I’ve created a Google Group email list for tracking conversation amongst collaborators and observers, and posted parts of this document as an editable Google Doc. After some initial feedback perhaps we should have a Google Hangout. (My plan to begin extracting Google from my life is not going well!) Suggestions for other tools and platforms are welcome.