Category Archives: delores

Posts about the Delores UKOER project

WordPress for hosting and describing learning resources

On 5 August I gave a presentation about Delores Selections with the above title to the CETIS Advances in Open Systems for Learning Resources workshop at the Edinburgh Repository Fringe meeting. Below is the PowerPoint presentation I used and the (lightly edited) notes taken by Nicola Osborne in her live blog of the event.

Slide 1
Delores is: Delivering Open Educational Resources for Engineering Design.

Slide 2
We have static and dynamic collections of university level OERs and other openly available resources relevant to Engineering Design. A static collection may include dynamic resources but the collection itself is static: once set up it stays as it is. Dynamic collections can have new materials added or taken away or developed.

ICBL, School of Mathematical and Computer Sciences, Heriot-Watt University and the University of Bath worked together on this project, funded by HEA and JISC under OER Phase 2.

Slide 3
We used WordPress to gather resources selected by experts in design engineering as being of high quality and usefulness for the collection. We aimed for about 100 objects in that collection of materials. The dynamic collection is everything underneath that. We use a tool called Sux0r which does Bayesian filtering of content – this is how Spam filtering works. We are using that idea the other way around – filtering to detect likely design engineering materials. Then we put material through a tool designed by Bath called Waypoint which enables faceted searching by automatic classification. Because Sux0r pulls RSS feeds from collections we know of, those feeds are continually updated and the collection presented by Waypoint continues to grow. I am going to focus on WordPress but I mention this context to point out that the technically difficult stuff, the effort, the hard thinking wasn’t really in the bit I am talking about.

Slide 4
So, starting off: what do we think we need in order to have this static collection? What are the needs for describing these OERs? First up you may not want to hold an actual copy of the resources. We decided we didn’t want to hold a copy of the resource: these are pre-existing resources hosted elsewhere. What metadata do we need? Title, description, authors, origin, date, subject, classification of some sort, licence, and probably something about the resource type. Users want to see that information, not necessarily locked up in an XML file. We want to embed a preview. We may or may not want to allow comments – but we don’t want to have to manage and spam filter those for the long term. We want something with a good web presence (and findable by Google) and something that has good participation (links in many directions, embedded material, widgets etc.; we want it to take part in the web). We want RSS feeds – great for pushing metadata around, we want embedded metadata (thinking RDFa, microformats etc), we want flexibility, want something easy to use and maintain (perhaps familiar), and possibly the option to export metadata?

Slide 5
The idea that we had was to use WordPress. One blog post per resource – if required you can attach resources that are single files to the post. This gives you a basic description and a good web presence. WP handles author, date, title, and you have tags and topics for classification. There are also extensions for metadata and additional functionality (a big developer community there).

Slide 6
We weren’t the first people with this idea…

Slide 7 & 8
Oxford’s Triton Project are running the Politics InSpires blog. They are creating OERs within WordPress – describe and comment on current affairs and other items. They have focused on add ons around that blog.

Slide 9 & 10
Edinburgh University have an initiative called OpenMed.

Slide 11 & 12
CETIS has been exploring the use of WordPress to disseminate our publications. We see a sneak preview and should note that resources are attached to posts and it looks nothing like a blog.

Slide 13
Scriblio (formerly WPopac) – WordPress theme to create an OPAC using WordPress

Slide 14 & 15
How were our goals met? Well most of what we wanted was possible.
All those question marks are where WordPress gives you information about the post not the resource described in the post, which matters for us because we are describing third party resources produced and hosted elsewhere. That is, you get the date, author etc of the WordPress post you wrote to describe the resource, which isn’t really what you wanted. You get RSS feeds which link to and are about the descriptions in WordPress, not the resources.

But you do get a good website that is easy to use and maintain and familiar – though the more flexibility you use, the harder it is to maintain.

An Aside
One thing I like about WordPress is Trackbacks – you can see when you’ve been blogged or linked to – people can write about you and you can then aggregate those comments on your post.

Slide 16
So some customisation…

We used WordPress’s custom fields and we adapted a theme so that these are displayed. And we will have either a Plugin or theme extension written so that the right metadata goes into the RSS feed.

Slide 17 and demo of Selections site
So let’s have a look in the system for bridges.

We can find a description and preview of the resource, links to it etc. Looking at the admin screen you can see we are using custom fields to include metadata about the object and we have set up categories that fit the curriculum. Lesa in the audience here wrote all of the resource description – she is a trained librarian and that has really been helpful here.


Training sux0r to recognise design engineering

I spent some time last month training sux0r to recognise what is and isn’t relevant to design engineering, and also to recognise what is relevant to each of the top level topics in the SEED curriculum that we are using to categorise resources. We should see the fruits of this training in filtered feeds coming out of sux0r as we add new feeds and as new resources are added to existing feeds. So where do we look for that? For simplicity I shall focus on the more important design relevant/not relevant categorisation here.

The sux0r API provides access to various feeds and other information relevant to the filtering; Santy has written in general terms about what is where in the Delores installation of sux0r. More specifically:

We can see from the Return vectors call that the relevant dimension to the categorisation is DesignEngRelevant, which has vectorID of 3.

Using this in the ReturnCategories per vector Call we get the categories categoryName=isDesignEng, categoryID=10; and categoryName=notDesignEng categoryID=11.

To get the feeds we use this information in the ReturnItems per Category call. So

Immediate feedback from Chris, who is the project team member who knows about Design Engineering, is that the automatic categorisation is working well at the is/isn’t relevant level.

WordPress development needed

The static collection, Delores Selections, is hosted on a WordPress installation. The rationale for this was that we could write a description of each resource as a WordPress post and WP plus existing plugins would give us lots of goodness such as a good web presence, category views, RSS feeds, simple search of the whole collection, widgets to bring in stuff from other sources, and (if we want them) some DC metadata, OAI-PMH data provision and RDFa. Well nearly.

The problem is that most of the metadata that is exposed through those channels refers to the blog post not the resource, which is not what we want. What we want is for, e.g., the RSS feed to convey information about the resource: the author, publication date, URL, etc of the resource; what we get is author, pubdate, URL of the WordPress post.

We’re half way to a solution. We’re using the WordPress custom fields to record the author, pub date, CC-licence URI, other rights, source and URI of the resource, and Santy has adapted the Carrington theme so that this information is displayed in the posts. But we don’t have it in the RSS or other ways of exposing the metadata.

So what we are proposing is to work on/commission plugins that will do this. RSS has to be first. The RSS custom fields plugin seems to be a good starting point–hopefully it wouldn’t be too difficult to adapt this to display our feeds in a more standard way than it does at the moment. And then on to PMH, Dublin Core and RDFa…
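As a rough illustration of the target output (not of the plugin itself, which would be written in PHP for WordPress), the sketch below builds a feed item that describes the resource rather than the post. The custom field names (URI, Author, Date, Licence, Source), the choice of Dublin Core elements and the example values are all assumptions made for this example, not a description of what the RSS custom fields plugin produces.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace, used here for resource-level metadata.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def resource_item(post_title, custom_fields):
    """Build an RSS <item> that describes the resource, not the blog post."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = post_title
    # Link to the resource itself rather than to the WordPress post.
    ET.SubElement(item, "link").text = custom_fields["URI"]
    # Author names are assumed to be separated with semicolons.
    for name in custom_fields["Author"].split(";"):
        ET.SubElement(item, f"{{{DC_NS}}}creator").text = name.strip()
    ET.SubElement(item, f"{{{DC_NS}}}date").text = custom_fields["Date"]
    ET.SubElement(item, f"{{{DC_NS}}}rights").text = custom_fields["Licence"]
    ET.SubElement(item, f"{{{DC_NS}}}source").text = custom_fields["Source"]
    return item


# Hypothetical field values, for illustration only.
example = resource_item(
    "Example bridge design course",
    {
        "URI": "http://example.org/resource",
        "Author": "A. Author; B. Author",
        "Date": "2011-01",
        "Licence": "http://creativecommons.org/licenses/by/3.0/",
        "Source": "http://example.org/",
    },
)
print(ET.tostring(example, encoding="unicode"))
```

The point of the sketch is only that every value in the item comes from the custom fields, so the author, date and link are those of the resource and not those of the post.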

Now, I am convinced that what we are doing is not just of interest to the Delores project. I think that there is scope for using WordPress for building catalogues of all sorts of materials. For example, with colleagues at CETIS I’m considering whether we could use it to present the publications that come out of CETIS: upload the publication, describe and classify it in a WordPress blog post, include a preview and there you have it, a lightweight WordPress publication repository. We’ve also been in touch with the TRITON project to see if there is any commonality between our plugin requirements and theirs.

One problem with extending this to other uses outside Delores is that they might not want to use our theme, they might have one of their own that they like better. So, we should explore whether it is possible to develop a plug-in or widget that would display the custom fields in the relevant blog post instead of using a theme for this. Also, the specific custom fields required may vary from case to case, so we would have to make sure the plugins had some flexibility to cope with this: possibly no big deal, certainly we would be up for this if it meant that the plugins were more sustainable beyond the end of this project.

A collection by any other name

We decided we needed to move to something a little more user-friendly than the default names of “static” and “dynamic” for the names of the two collections that the Delores project is building. So we are thinking of:

Delores Selections (resources for Design Engineering selected by experts)
Delores Extensions (helping you go further with Design Engineering)

Delores Selections is being built here, but there’s not much content yet.

Describing OERs in the static collection

For the static collection part of the Delores project we will be creating WordPress posts that describe and preview OERs that have been selected from various well-known sources as being particularly useful for engineering design. This is a first cut at a specification for how we will write those descriptions.

About the Resources
The resources being described for Delores are likely to be fairly substantial pieces of content, i.e. between a lecture and a course worth of material rather than individual images. We believe that lecture-sized resources are probably going to be more reusable (especially by teachers), but describing resources at the course level has the advantage of covering more material for the time spent doing the resource description. One aim is clear: all the material covered should be directly relevant to engineering design, so while it makes sense to link to and describe an engineering design course at course level, it does not make sense to link to a general engineering course which has a few resources relevant to design.

The resource descriptions we need can be thought of as being in two parts: a blog post about the resource and some associated metadata about the resource. These are both entered through the WordPress admin interface.

Resource Description
Title: the title of the WordPress post shall be the title of the resource, in sentence case. Taken from the resource.

Description: the body of the WordPress post shall be a description of the resource, which we expect to be 3 or 4 paragraphs long. This should refer to the origin of the resource (e.g. MIT open courseware, OU Open Learn), the date and the details of what is in the resources in terms of resource types, subjects covered, academic level (e.g. first-year, introductory, masters-level &c).

If the resource comprises a number of parts that may be of significant use in themselves then link to those parts when describing them (this includes parts of a course, e.g. problem sheets, lessons, and different formats of a resource). Conversely, where a resource is part of a course, book or similar collection it may be worth linking up to the larger aggregation as well as to the atomic elements.

Include a preview of the resource or embed the resource into the blog post. A preview can be a screenshot of the home page for the resource. Previews such as screenshots should link to the resource. Where a resource is hosted on a site such as YouTube, Scribd or SlideShare that facilitates embedding into webpages by providing copy and paste code then make use of this.

Since these are open resources it’s OK to copy large chunks of the description and images from the resource’s web page, but if you do this please put an acknowledgement in square brackets [] at the end of the post.

Subject: Use the WordPress category functionality to specify the topic of the resource. A hierarchical list of Engineering design topics has been adopted for use in the Delores project.

Other properties: [In development, could become another category tree] Use WordPress tags to label the resource with an indication of the type of resource / level of granularity, e.g. Courseware (= set of resources associated with a course), Online book, lecture recording, video, audio, powerpoint slides, simulation.

Custom fields
We have used WordPress’s “custom field” functionality to add the following metadata as name/value pairs.

Identifier, URI: The URL of the resource being described. Link directly to the resource not to a description of it in some catalogue. If the resource is found in multiple places, where possible link to the copy on the site of whoever released it/published it.

Author: Author name(s) as on the resource, separated with semicolons. We’re not formatting these.

Licence: URI of the licence under which the resource is released. Will normally be a creative commons licence. Normally the URI will be a link on the resource.

Date: Date of last significant update of the resource. Use format yyyy[-mm[-dd]], e.g. 2011 or 2011-01 or 2011-01-29. Be as specific as you can. Watch out for instances where the release date as an OER or the last updated date for the web page differs significantly from the date of the resource.

Rights: human readable statement of copyright owner and any other significant rights, including the licence under which it is released.

Source: the URL for the home page of the collection or initiative through which the resource is released, e.g. for MIT OCW.
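The Date and Author conventions above are simple enough to check mechanically. The following is a minimal sketch of such a check, assuming the values are available as plain strings; the function names are made up for this example and are not part of WordPress or of our theme.

```python
import re
from datetime import date

# A year, optionally followed by a month, optionally followed by a day.
DATE_PATTERN = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_valid_resource_date(value):
    """Return True if value is a real date written as yyyy, yyyy-mm or yyyy-mm-dd."""
    match = DATE_PATTERN.fullmatch(value)
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1, just to test that the date exists.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


def split_authors(value):
    """Split a semicolon-separated Author field into a list of names."""
    return [name.strip() for name in value.split(";") if name.strip()]


print(is_valid_resource_date("2011-01-29"))
print(is_valid_resource_date("29/01/2011"))
print(split_authors("A. Author; B. Author"))
```

Names are split but deliberately not reformatted, in line with the note on the Author field.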

Finding OERs

Background: Chris McMahon is the Delores project director. He has a great deal of experience in the management and presentation of information for design engineering and in selecting and using online learning materials, but this project is his introduction to the world of OER. His initial exploration of OER-specific resource discovery has left him questioning whether aggregating and searching metadata provided by OER producers is the right approach as opposed to customising a generic search to be specific to known OER sites. Chris writes:


My initial reaction from attempting to find material in the OER repositories and collections is that the descriptions of the available material are not particularly helpful in searching and finding resources. For example, I tried to find material on “gear design” in OER Commons. The 30 resources returned for my search were as follows:

Eight audio files from UC Berkeley. All were potentially relevant but little real indication of content was given in the descriptions. I would have to listen to each file to find if it is relevant. Only the title of the audio file indicates that it might be useful (each file has the same abstract, which describes the whole course–not the particular audio file).

The next eight resources were not relevant but included because the word gear appears out of the context of gear design (e.g. landing gear, protective gear) somewhere in the descriptions.

The next resource, MIT Open Courseware “Elements of Mechanical Design”, is very relevant but the reference expands to 17 sets of lecture notes, of which only 2 are relevant. The Abstract is only a very high level description of the whole course and gives no indication of the breadth and relevance of the underlying materials.

The next four resources are not relevant.

The next resource, MIT Open Courseware “Marine Power and Propulsion”, expands to 45 separate lecture documents, of which 2/3 are relevant. Again the abstract is only a very high level description and gives no indication of the breadth and relevance of the underlying materials.

The next resource is a repeat of the MIT OCW “Elements of Mechanical Design” but from an earlier year.

The next seven resources are not relevant but the descriptions contain words for which gear and design are stems.

In summary – the descriptions are whole course descriptions, not descriptions of the lecture/topic material within the courses. The descriptions (and presumably the RSS feeds) use the same format for single audio files and complete courses.

By contrast, using “gear design” as a search in Google gave very relevant material in the first page of the (327) results. Using the “type=PDF” qualifier was even better as it pulled up the lecture notes. Using the MIT OCW search facility was pretty good also.

What would be really useful would be to have a good search facility that allowed search within known OER repositories – a sort of “Google OER”.


Since talking this through with Chris I have resolved to make a better effort at publicising work that my colleague Lisa Scott has done on Google Custom Search Engines. However there are other implications for the project: in the static collection, how do we select and provide descriptions at the fine level of granularity that Chris wants while also keeping the valuable information of the original course context of the resource; will the quality of the syndicated metadata be good enough for the Bayesian filtering to work; can we supplement this by using information from the course/resource webpage; what use can we make of customised Google searches? (We know the Triton project are also interested in this last point.)
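To make the idea of a search restricted to known OER sites a little more concrete, here is a small sketch that builds an ordinary search query using the site: and filetype: operators. It is not how a Google Custom Search Engine is configured, and the function name and the list of sites are assumptions chosen for illustration.

```python
def oer_query(terms, sites, filetype=None):
    """Build a phrase query limited to the given sites and, optionally, a file type."""
    phrase = f'"{terms}"'
    site_clause = " OR ".join(f"site:{site}" for site in sites)
    parts = [phrase, f"({site_clause})"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


# Illustrative list of sites; a real list would need to be curated.
known_oer_sites = ["ocw.mit.edu", "openlearn.open.ac.uk"]
print(oer_query("gear design", known_oer_sites, filetype="pdf"))
```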

An introduction to Delores

At the UKOER phase 2 startup meeting the collection strand projects were asked to provide a short introduction saying what their collection was about, who it was for, where the material was coming from and what technical approach was being used. This is roughly what I said about Delores.

We are building static and dynamic collections of open educational resources for Engineering Design.

Engineering Design is the branch of engineering dealing with the design of all engineered products from clothes pegs to Concorde. It deals principally with creating something that will work in a way that satisfies the design need. The design process consists of a number of phases, starting with floating ideas about how the design need might be met, and ending with delivery of a complete and detailed description of the product to be manufactured, based on sound engineering concepts and principles.

The static collection will use a WordPress blog (not this one) to present resources that have been selected for their match to common elements of engineering design curricula from several UK universities.

Resources will come from UKOER projects, OpenLearn, OCWC, Jorum, Xpert, OCWSearch, OER Commons…anywhere we can find them. We will do one blog post per resource, either embedding the resource in the post or linking out to it (depending on nature of resource) and categorise it against topics from the curriculum. WordPress provides web presence, user interface, search + browse, RSS export, and metadata (Dublin Core, OAI-ORE) either out-of-the-box or with suitable choice of plug-ins and theme.

The dynamic collection is technically much more interesting. We will draw on the same sources of OERs and use the same curriculum, but this time selection and classification will be automatic.

We will use the output of the JISC Bayesian Feed Filter project to aggregate and select resources. Bayesian filtering is the same process as many spam filters use, but we will be using it to recognise resources about design engineering on the basis of information from RSS feeds. Like a spam filter it needs to be trained with items that it is told how to classify: we will be using the resources from the static collection to show it what design engineering resources look like.
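For readers unfamiliar with the technique, the following is a minimal sketch of a naive Bayes text classifier of the kind used in spam filters. It is not the sux0r implementation: the class, the tokenisation, the add-one smoothing and the tiny training set are all assumptions made for illustration, and only the two category names are borrowed from our setup.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesFilter:
    def __init__(self):
        self.word_counts = {}        # category -> Counter of word frequencies
        self.doc_counts = Counter()  # category -> number of training items

    def train(self, text, category):
        """Tell the filter that this text belongs to this category."""
        self.word_counts.setdefault(category, Counter()).update(tokenize(text))
        self.doc_counts[category] += 1

    def classify(self, text):
        """Return the category with the highest log-probability for the text."""
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, counts in self.word_counts.items():
            # Prior: how common the category was in training.
            score = math.log(self.doc_counts[category] / total_docs)
            total_words = sum(counts.values())
            for word in tokenize(text):
                # Add-one smoothing so unseen words do not zero the score.
                score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


feed_filter = NaiveBayesFilter()
feed_filter.train("Gear design and tolerance analysis for mechanical components", "isDesignEng")
feed_filter.train("Design of shafts, bearings and gear trains", "isDesignEng")
feed_filter.train("Introduction to medieval poetry and literature", "notDesignEng")
feed_filter.train("History of modern European politics", "notDesignEng")
print(feed_filter.classify("Lecture notes on gear design"))
```

With a training set this small the result means very little; the sketch only shows why the quality of the text in each feed item matters, since the words in the title and description are all the filter has to go on.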

The Bayesian feed filter will produce an RSS feed of items that it thinks are about design engineering. This will be sent to some software developed at Bath called Waypoint.

Waypoint is an integrated search and retrieval system developed for accessing engineering documents (but generalizable to any document corpus) which has two main elements.

In the first element, the documents are organized against a set of classification schemes. This approach is known as facetted classification. The classification is automatic, being carried out using a standard constraint-based classification approach (using carefully selected classification rules) in which pre-coded sets of constraints are used to relate the textual content of each document with the particular topics or themes (characteristically technical and process topics) of interest.

The second element consists of a browsable user interface which provides the user with continuous feed-back of how the search is progressing based on the selections made and keywords used. The user interacts with the system by selecting facet categories of interest. As these are selected the hierarchical display is dynamically pruned to reflect the user’s selection in order to indicate which categories may be used to further refine the selection. This approach is known as Adaptive Content Matching, the effect of which is to present the user at all times with only that part of the classification structure that will lead to a non-null selection.
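The pruning behaviour described here can be sketched in a few lines. This is not Waypoint code: the data layout, the function names and the simplification that each document has exactly one category per facet are assumptions made for this example.

```python
def matching_documents(documents, selection):
    """Return the documents that match every selected facet category."""
    return [
        doc for doc in documents
        if all(doc["facets"].get(facet) == category
               for facet, category in selection.items())
    ]


def remaining_categories(documents, selection):
    """For each unselected facet, list the categories that still lead to results."""
    options = {}
    for doc in matching_documents(documents, selection):
        for facet, category in doc["facets"].items():
            if facet not in selection:
                options.setdefault(facet, set()).add(category)
    return options


# Hypothetical pre-classified documents.
docs = [
    {"title": "Gear design notes", "facets": {"topic": "Machine elements", "type": "Lecture notes"}},
    {"title": "Bridge design video", "facets": {"topic": "Structures", "type": "Video"}},
    {"title": "Shaft design slides", "facets": {"topic": "Machine elements", "type": "Slides"}},
]
print(remaining_categories(docs, {"topic": "Machine elements"}))
```

After selecting the topic, the "Video" type is no longer offered because no remaining document has it, so any further choice the user can make is guaranteed to return at least one document.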

Waypoint is particularly suited as an interface for students searching for pre-classified educational resources as it allows exploration of the search space in a rewarding way.

It is very intuitive to use, and the facetted classification and ACM approaches mean that the user is not frustrated by the system merely returning an empty set.

There was some discussion about the relative merits of blogs and wikis as hosts for the static collection. I wouldn’t criticise anyone for choosing to use a wiki for work similar to this, but to my mind the advantages of a WordPress blog are that the category approach gives good navigation and provides sub-collections (with RSS feeds); the developer community is good, with plugins for just about anything you might want to do; and the backup/export/migration features are solid. Wikis I think might be better if you want more flexibility in presenting views on parts of your collection (for example Wikipedia’s portals) rather than a simple list by category, and if you want collaborative editing of content, with changelogs, rollback etc.