TILE: Libraries, usage data and recommendation

Tesco know how old my children are. They know this because, through tracking what we buy online or using a loyalty card, they know when we started to buy nappies and baby talc. They have used this information to send us special offer vouchers for baby foods etc. The library at my home institution have similar information about students’ borrowing habits and in principle have access to information about what courses the students were enrolled on and what they access through the VLE. This information was described as a goldmine, but libraries don’t use it. This workshop was about how they might use it, with the related questions of should they and could they.

The analogy to supermarket loyalty cards came from Dave Pattern of Huddersfield University library, where library usage information is being used to enhance their search services. Dave described some of the service enhancements they build on it, for example, “people who borrowed the book you’ve just looked at also borrowed this one”, “based on your previous borrowing you might be interested in this book”, and predicting “borrowing paths”, i.e. what comes next. He also uses search log data to enhance services, e.g. looking at common searches that yield no hits, and looking for words that are often paired so that suggestions can be made for refining searches that yield too many results. These are useful, but what else could be done? Dave doesn’t pretend to have all the answers so, in what was described by the organiser of the meeting as “quite a moment”, Dave announced that Huddersfield University Library have released their usage data under an open license for other people to experiment with. “You don’t often get moments like this in the library world.”
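To make the “people who borrowed this also borrowed that” idea concrete, here is a minimal sketch in Python. It is not how Huddersfield do it (the workshop didn’t go into that level of detail); it just counts how often two books turn up in the same borrower’s history. The loan records, the user ids and the function names are all made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical loan records as (borrower_id, book_title) pairs.
# Invented for illustration; this is not the Huddersfield data format.
loans = [
    ("u1", "Book A"), ("u1", "Book B"), ("u1", "Book C"),
    ("u2", "Book A"), ("u2", "Book B"),
    ("u3", "Book A"), ("u3", "Book C"),
    ("u4", "Book B"), ("u4", "Book D"),
    ("u5", "Book A"), ("u5", "Book B"),
]


def build_co_borrow_counts(loans):
    """Count, for every pair of books, how many borrowers took out both."""
    books_by_user = defaultdict(set)
    for user, book in loans:
        books_by_user[user].add(book)

    co_counts = defaultdict(Counter)
    for books in books_by_user.values():
        # Each unordered pair in one borrower's history counts once.
        for a, b in combinations(sorted(books), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def also_borrowed(co_counts, book, top_n=3):
    """Return up to top_n titles most often borrowed alongside `book`."""
    neighbours = co_counts.get(book, Counter())
    # Highest count first; ties broken alphabetically so results are stable.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


co_counts = build_co_borrow_counts(loans)
print(also_borrowed(co_counts, "Book A"))  # ['Book B', 'Book C']
```

Note that the suggestion step only needs the pair counts, not the borrower ids, so the counts could be computed once and shared without saying anything about who borrowed what.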

As well as the data that is passively left by users there is also all the information that is consciously placed on the web by today’s web 2.0 users: bookmarks, ratings, comments, recommendations. What is the potential for using that? Is the library catalogue the place to try to create a social network? Probably not; but the library can probably put information into community networks and can try to harvest it. This, however, is more difficult for a number of reasons, so it seems a good idea to start with ideas that will work even if such data is not available, and to progress slowly, demonstrating its worth and encouraging the release of more information.

Another theme of the day was “concentration” (i.e. aggregation, as opposed to dissemination or diffusion). We heard from Joy Palmer of MIMAS about COPAC, a union catalogue for research libraries in the UK, and plans for how it will develop. Mark van Harmelen presented an architecture for harvesting usage data and how it might be used with local user profile information and to enhance searches of a union catalogue.

There will be issues around all this, many of which were discussed at the meeting. There are issues related to data protection, academic integrity, pedagogic appropriateness, “homogenization” of the learning experience, systems integration, data quality, data quantity, the reliability of any inferences, potential harm to some stakeholders (what if it seems that not many books in the library are actually used?) … etc. But the general outlook from this meeting seemed to be that there were opportunities as well as problems, and in order to find out which were real rather than imaginary it was necessary to get on with some prototypes and experimentation. Doing so in the open, as Dave Pattern is doing, will only help. For example, talk of aggregating usage data sounds intrusive (it’s about users, isn’t it?) but looking at the data released from Huddersfield Uni. suggests otherwise.

Finally, a recollection. I remember about 14 years ago I started using the Internet Movie Database (which was maybe then still the “Cardiff movie database”, hosted at Cardiff University). They ran a survey asking for suggestions on how it might improve its services. One suggestion stuck in my mind: that the information about a user’s top rated movies could, by correlation with other users’ ratings, be used to generate suggestions for what else the user might like. A lot has happened since then, and Amazon (who bought IMDb) are well known for this type of recommendation, but it’s nice to see the idea coming back.

TILE stands for Towards Implementation of Library 2.0 and the e-Framework. It is a JISC-funded study looking at how web 2.0 concepts can enhance the library, and at building shared models to help share experience of doing this.
