Xesam Protocol for Metadata Harvesting

This page is a draft.

This draft is heavily inspired by Philip's proposal for metadata harvesting from GNOME's email client Evolution, mixed with some of the ideas found in OAI-PMH, the widespread standard for harvesting metadata online.

Concepts

Target: Some application or service exposing a Xesam-PMH DBus interface allowing clients to harvest metadata stored in, or by, the target

Crawler: A specific component inside the harvesting clients that is responsible for aggregating the harvested data

Payload: A nugget of information harvested from a target. A payload stores metadata about a single item, identified by the item's URI

API

org.freedesktop.xesam.pmh.Target

org.freedesktop.xesam.pmh.Crawler

About Payloads

A payload has the DBus signature

(ssssta(ss))

Conceptually the payload is split into a header, the leading sssst of the signature, and a body, the trailing a(ss).

Payload Header Explained

As ordered by the DBus signature sssst:

Payload Body

The payload body is just an array of string pairs, tuples of (field_name, value).
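To make the payload shape concrete, here is a minimal Python sketch that checks whether a value matches the signature (ssssta(ss)): four strings and an unsigned 64-bit integer for the header, then a list of string pairs for the body. The function name is_valid_payload and the sample values are assumptions made for illustration only. The header values are deliberately left as placeholders, and the field name in the body is just an example.

```python
from typing import Any

UINT64_MAX = 2**64 - 1  # upper bound of the DBus type 't'


def is_valid_payload(payload: Any) -> bool:
    """Return True if `payload` has the shape of the signature (ssssta(ss))."""
    if not isinstance(payload, tuple) or len(payload) != 6:
        return False
    *strings, number, body = payload
    # Header: four strings (ssss) followed by one unsigned 64-bit integer (t).
    if not all(isinstance(s, str) for s in strings):
        return False
    if isinstance(number, bool) or not isinstance(number, int):
        return False
    if not 0 <= number <= UINT64_MAX:
        return False
    # Body: an array of (field_name, value) string pairs, a(ss).
    if not isinstance(body, list):
        return False
    return all(
        isinstance(pair, tuple)
        and len(pair) == 2
        and isinstance(pair[0], str)
        and isinstance(pair[1], str)
        for pair in body
    )


# Placeholder header values; only the types matter for this check.
example = ("s1", "s2", "s3", "s4", 0, [("xesam:title", "Hello")])
print(is_valid_payload(example))
```

A check like this only covers the types. It says nothing about what each header slot means.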

Why a Push Based Solution

It might seem more natural for the API to be pull based, just like ordinary OAI-PMH, where ListRecords would be used to page through the updated items. The idea of installing a crawler into the target, and then having the target push data into the crawler, is to enable a more lightweight system for iterative updates: each crawler receives new payloads as they appear instead of having to poll the target for changes.
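The push model can be sketched in plain Python without any DBus plumbing. Everything named here (the Target class, install_crawler, item_changed) is hypothetical and chosen for illustration. None of it comes from the org.freedesktop.xesam.pmh interfaces. The sketch only shows the control flow: crawlers are registered once, and the target then hands each new payload to every installed crawler.

```python
from typing import Callable, List, Tuple

# Python view of the payload signature (ssssta(ss)).
Payload = Tuple[str, str, str, str, int, List[Tuple[str, str]]]
Crawler = Callable[[Payload], None]


class Target:
    """Hypothetical in-process stand-in for a harvesting target."""

    def __init__(self) -> None:
        self._crawlers: List[Crawler] = []

    def install_crawler(self, crawler: Crawler) -> None:
        # A harvesting client registers its crawler once, up front.
        self._crawlers.append(crawler)

    def item_changed(self, payload: Payload) -> None:
        # The target pushes the payload to every crawler; nobody polls.
        for crawler in self._crawlers:
            crawler(payload)


# Three hypothetical harvesters, each collecting what it is sent.
inboxes: List[List[Payload]] = [[], [], []]
target = Target()
for inbox in inboxes:
    target.install_crawler(inbox.append)

# Placeholder header values; only the shape matters here.
target.item_changed(("s1", "s2", "s3", "s4", 0, [("xesam:title", "Hello")]))
print([len(inbox) for inbox in inboxes])
```

In a pull-based design each of the three harvesters would instead have to ask the target for changes on its own schedule.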

Consider three different apps wanting to harvest metadata from an email client - preferably these apps want real time updates as emails trickle in. With the API of this spec the situation would be as follows:

Drafts/XesamPMH (last edited 2009-06-11 06:15:12 by MikkelKamstrupErlandsen)