About metadata & resource description (pt 1)

Trying to distinguish between metadata and resource description…

In our online support session for the UKOER programme, some of which John has summarized (1 2 3), instead of giving participants a definition of metadata, we gave them a choice of definitions and asked them to vote on what they understood the word to mean.

The options were:
A: data about data
B: structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.
C: pretty much any information about anything.
D: any of the above.

You might recognise option A as the etymological definition and B as NISO’s definition, found in Understanding Metadata [pdf]. I was interested in how many people included C in what they understood when they used or heard the term metadata. This was prompted by a comment, I forget from whom and in what context, that the idea of metadata defined in option B was fine in a specialized academic sense, but that the word was used more widely and so loosely that you could no longer rely on that being what people meant. In other words, you could not assume that someone who said they had metadata would be able to provide you with nicely structured, machine-readable XML/RDF/HTML-Meta tagged information.

Our sample of participants in the online session wasn’t scientifically chosen. Everyone had some connexion with the UK OER programme, either working for a project or helping to manage or provide advice to the programme; there was approximately equal representation of managers and technical people (with some overlap, I guess), and one person had a library/information science background (that was my co-presenter, John!). The vote came out as:
5 for A: Data about data;
14 for B: Structured information…;
0 for C: any information about anything;
10 for D: any of the above.

In retrospect it’s not surprising that no one voted for C, since the people in our audience who recognise that as a meaning are likely to have come across A and B as well.

As someone said during the vote, you can tell B is the “right” answer because it is the longest and most formal looking option :-). For me, data about data is too restrictive in range, and I think it would be helpful not to call option C/D metadata. I would rather use the term resource description to cover all options and reserve metadata for the structured information about a resource (which includes but is broader than data about data). So metadata tells a computer that 2009-09-11 is to be interpreted as a date in ISO 8601 format, and is the sort of structured information found in LOM and Dublin Core. Resource description may be metadata or may be free text for people to read. Computers such as those run by Google can do a pretty good job of processing information aimed at people; people (on the whole) aren’t very good at reading information aimed at computers.
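To make the distinction a bit more concrete, here is a minimal Python sketch of the difference between a structured date field and a free-text description. The record, its field names and the free-text sentence are all made up for illustration (the field names are only loosely modelled on Dublin Core); the one real fact used is that 2009-09-11 is a valid ISO 8601 calendar date.

```python
from datetime import date

# Structured description: the "date" field is agreed to hold an ISO 8601
# calendar date, so a program can interpret it without guessing.
# (Hypothetical record; field names are illustrative only.)
metadata = {"title": "An example OER", "date": "2009-09-11"}

# Free-text description aimed at a human reader (also made up).
free_text = "An example OER, written in September 2009."


def parse_iso_date(value):
    """Return a date if value is an ISO 8601 calendar date (YYYY-MM-DD), else None."""
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


created = parse_iso_date(metadata["date"])  # a real date object
not_a_date = parse_iso_date(free_text)      # None: nothing declares where the date is
```

A person reading the free text has no trouble seeing when the resource was written; the program only gets a usable date from the field whose format has been agreed in advance.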

I think that the best view of metadata is that it shows the relationship between resources. “Resources” here means anything that can be identified (if you cannot identify it you cannot show how things are related to it), including: information resources like the OERs, people, places, things, organizations, abstract concepts. What metadata does is express the assertion that this OER (for example) was created by this person. I’ll try to show how this allows the mixing up of metadata and resource description (in a good way) in my next post.
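As a rough illustration of metadata as assertions about relationships between identified resources, here is a small Python sketch. It is not any particular standard's data model: the identifiers, the relationship names (createdBy, memberOf) and the helper function are all hypothetical, chosen only to show the shape of the idea.

```python
# Each assertion is a (subject, relationship, object) triple linking two
# identified resources. All identifiers below are invented for illustration.
OER = "http://example.org/oer/42"
PERSON = "http://example.org/people/jane"
ORG = "http://example.org/org/uni"

assertions = [
    (OER, "createdBy", PERSON),   # this OER was created by this person
    (PERSON, "memberOf", ORG),    # this person belongs to this organization
]


def related(subject, relationship, triples):
    """Return every resource that `subject` is linked to by `relationship`."""
    return [o for s, r, o in triples if s == subject and r == relationship]


creators = related(OER, "createdBy", assertions)
```

Because the person and the organization are themselves identified resources, further assertions can be made about them in exactly the same way as about the OER.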