We have a meeting coming up on the topic of investigating what data we have (or could acquire) to answer the question of what metadata is really required to support the discovery, selection, use and management of educational resources. At the same time as I was writing a blog post about that, over at OCWSearch they were publishing the list of top searches for their collection (I think Pierre Far is the person to thank for that). So, what does this tell us about metadata requirements?
I’ve been through the terms at the top half of the list (it says that the list is roughly in descending order of popularity, however it would be really good to know more about how popular each search term was) and tried to judge what characteristic or property of the resource the searcher was searching on.
There were just under 170 search terms in total. It doesn’t surprise me that the vast majority (over 95%) of them are subject searches. Both higher-level, broad subject terms (disciplines, e.g. “Mathematics”) and lower-level, finer-grained subject terms (topics, e.g. “Applied Geometric Algebra”) crop up in abundance. I’m not sure you can say much about their relative importance.
What’s left is (to me) more interesting. We have:
- resource types, specifically: “online text book”, “audio”, “online classes”.
- People, who seem to be staff at MIT, so while it’s possible someone is searching for material about them or about their theories, I think it is likely that people are searching for them as resource creators
- level, specifically: 101, Advanced (x2), college-level. These are often used in conjunction with subject terms.
- Course codes e.g. HSM 260, 15.822, Psy 315. (These also imply a level and a subject.)
I think with more data and more time spent on the analysis we could get some interesting results from this sort of approach.