I keep coming back to the idea of embedding metadata in the web pages that describe resources for human readers.
Last week I was discussing RDFa vs triple stores with Wilbert. He made the point that publishing RDF is easier to manage, less error prone and easier on the consumer if you deal with it on its own, rather than trying to encode triples and produce a human-readable web page in valid XHTML all at the same time. A valid point, though Wilbert’s starting point was “if you’re wanting to publish RDF”, and that still left me with the question: when do we want metadata (i.e. encoded, machine-readable resource descriptions), when do we want resource descriptions that people can read, and do we really have to separate the two?
Then yesterday, following a recommendation by Dan Rehak, I read this excellent comparison of three approaches that could be used to manage resource descriptions or metadata: relational databases, document stores/noSQL, and triple stores/RDF. It really helps in that it explains how storing information about “atomic” resources is a strength of document stores (with features like versioning and flexible schema) and storing relationships is a strength of triple stores (with, you know, features like links between concepts). So you might store information about a resource as an XML document structured by some schema so that you can extract the title, author name etc., but sometimes you want to give more detail, e.g. you might want to show how the subject relates to other subjects, in which case you’re into the world where RDF has its strengths. And then again, while an author name is enough for many uses, an unambiguous identifier for the author, encoded so that a machine will understand it as a link to more information about the author, is also useful.
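To make that contrast a bit more concrete, here is a minimal Python sketch of the same resource described both ways. It is only an illustration under my own assumptions: all the example.org URIs, the field names and the helper functions are made up, and the prefixed predicates (`dc:creator`, `foaf:name`, `skos:broader`) are plain strings standing in for Dublin Core, FOAF and SKOS terms. A real system would use an actual document store and triple store rather than a dict and a list.

```python
# Document-store view: one self-contained record per "atomic" resource.
# Easy to fetch and display, but the author and subject are just strings.
document = {
    "id": "http://example.org/resource/1",
    "title": "An example resource",
    "author_name": "A. N. Author",
    "subject": "Metadata",
}

# Triple-store view: (subject, predicate, object) statements.
# The author and subject are identifiers that other statements can hang off.
RESOURCE = "http://example.org/resource/1"
PERSON = "http://example.org/person/1"
TOPIC = "http://example.org/subject/metadata"
BROADER_TOPIC = "http://example.org/subject/information-science"

triples = [
    (RESOURCE, "dc:title", "An example resource"),
    (RESOURCE, "dc:creator", PERSON),
    (PERSON, "foaf:name", "A. N. Author"),
    (RESOURCE, "dc:subject", TOPIC),
    (TOPIC, "skos:broader", BROADER_TOPIC),
]


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def author_names(resource):
    """Follow the creator link, then read the name attached to that person."""
    return [
        name
        for person in objects(resource, "dc:creator")
        for name in objects(person, "foaf:name")
    ]


def broader_subjects(resource):
    """Follow the subject link to find the broader subjects it relates to."""
    return [
        broader
        for topic in objects(resource, "dc:subject")
        for broader in objects(topic, "skos:broader")
    ]
```

Reading `document["author_name"]` is all a human-facing page needs, whereas `author_names(RESOURCE)` gets to the same name by following a link, and `broader_subjects(RESOURCE)` answers a question the flat record cannot answer at all.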
- Building Linked Data For Both Humans and Machines [pdf]
- Linked data for people first, machines second [ppt] (you had to be there, but slide 4 is worth
- An infrastructure service anti-pattern