Harvesting RDF Triples [ Joe Futrelle ]

Abstract:

Managing scientific data requires tools that can track complex provenance information about digital resources and workflows. RDF triples are a convenient abstraction for combining independently generated factual statements, including statements about provenance. Harvesting is a strategy for asynchronously acquiring distributed information for aggregation and analysis. It typically requires that information be temporally scoped and attributed to a creator or information source. An RDF triple, however, asserts a fact without attributing it to any actor or period of time, so the abstraction must be extended to support typical harvesting scenarios. This paper compares standard, conventional, and non-standard means of extending RDF triples to associate them with attribution and timing information. It then considers the implications of these techniques for harvesting and presents some implementation sketches based on a journaling strategy.
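
As a rough illustration of the idea summarized in the abstract, the sketch below pairs each triple with a source and a timestamp and stores the result in an append-only journal, so that a harvester can ask only for entries it has not yet seen. It is a minimal sketch under stated assumptions, not the paper's implementation: the names JournalEntry, Journal, append, and harvest, the "ex:" identifiers, and the timestamps are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A plain RDF-style triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class JournalEntry:
    """A triple extended with attribution and timing (hypothetical fields)."""
    seq: int          # position in the journal, assigned on append
    triple: Triple
    source: str       # who asserted the triple
    asserted_at: str  # ISO 8601 timestamp supplied by the source


class Journal:
    """Append-only log of attributed, timestamped triples."""

    def __init__(self) -> None:
        self._entries: List[JournalEntry] = []

    def append(self, triple: Triple, source: str, asserted_at: str) -> JournalEntry:
        # The sequence number is simply the entry's index in the log.
        entry = JournalEntry(len(self._entries), triple, source, asserted_at)
        self._entries.append(entry)
        return entry

    def harvest(self, after_seq: int = -1) -> List[JournalEntry]:
        # Return every entry appended after the given sequence number.
        return [e for e in self._entries if e.seq > after_seq]


# Usage: a harvester remembers the last sequence number it saw.
journal = Journal()
journal.append(("ex:dataset1", "ex:derivedFrom", "ex:dataset0"),
               "ex:labA", "2000-01-01T00:00:00Z")
journal.append(("ex:dataset1", "ex:createdBy", "ex:workflow7"),
               "ex:labB", "2000-01-02T00:00:00Z")

first_batch = journal.harvest()
last_seen = first_batch[-1].seq

journal.append(("ex:dataset2", "ex:derivedFrom", "ex:dataset1"),
               "ex:labA", "2000-01-03T00:00:00Z")
second_batch = journal.harvest(after_seq=last_seen)
```

Because the journal is never rewritten, the second harvest returns only the entry appended after the first harvest, and each returned entry still carries the source and time of the assertion.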

Annotation:

Keywords: RDF (Resource Description Framework)