You may have missed that just before Christmas HECoS (the Higher Education Classification of Subjects) was announced. I worked a little on the project that led up to this, along with colleagues in Cetis (who led the project), Alan Paull Services and Gill Ferrell, so I am especially pleased to see it come to fruition. I believe that as a flexible classification scheme built on semantic web / linked data principles it is a significant contribution to how we share data in HE.
HECoS was commissioned as part of the Higher Education Data & Information Improvement Programme (HEDIIP) in order to find a replacement for JACS, the subject coding scheme currently used in UK HE when information from different institutions needs to be classified by subject. When I was first approached by Gill Ferrell while she was working on a preliminary study to determine whether JACS needed changing, my initial response was that something much more in tune with semantic web principles would be very welcome (see the second part of this post that I wrote back in 2013). HECoS has been designed from the outset to be semantic web friendly. Also, one of the issues identified by the initial study was that aggregation of subjects was politically sensitive. For starters, the level of funding can depend on whether a subject is, for example, a STEM subject or not; but there are also factors of how universities as institutions are organised into departments/faculties/schools and how academics identify with disciplines. These lead to unnecessary difficulties in subject classification of courses: it is easy enough to decide whether a course is about ‘actuarial science’ but deciding whether ‘actuarial science’ should be grouped under ‘business studies’ or ‘mathematics’ is strongly context dependent. One of the decisions taken in designing HECoS was to separate the politics of how to aggregate subjects from the descriptions of those subjects and their more general relationships to each other. This is in marked contrast to JACS, where the aggregation was baked into the very identifiers used. That is not to say that aggregation hierarchies aren’t important or won’t exist: they are, and they will. Indeed there is already one for the purpose of displaying subjects for navigation, but they will be created through a governance process that can consider the politics involved separately from describing the subjects.
This should make the subject classification terms more widely usable, allowing institutions and agencies who use it to build hierarchies for presentation and analysis that meet their own needs if these are different from those represented by the process responsible for the standard hierarchy. A more widely used classification scheme will have benefits for the information improvement envisaged by HEDIIP.
The next phase of HECoS will be about implementation and adoption, for example the creation of the governance processes detailed in the reports, moving HECoS up to proper 5-star linked data, help with migration from JACS to HECoS and so on. There’s a useful summary report on the HEDIIP site, and a spreadsheet of the coding system itself. There’s also still the development version Cetis used for consultation, which better represents its semantic webbiness but is non-definitive and temporary.
People always want to know how much LRMI exists in the wild, and now schema.org reports this information. Go to the schema.org page for any class or property and at the top it says on how many domains markup for it has been found. Obviously this misses that not all domains are equal in extent or importance: finding LRMI on pjjk.net should not count as equal to finding it on bbc.co.uk, but as a broad indicator it’s OK: finding a property on 10 domains or 10,000 domains is a valid comparison. LRMI properties are mostly reported as found on 100-1000 domains (e.g. learning resource type) or 10-100 domains (e.g. educational alignment). A couple of LRMI properties have greater usage, e.g. typical age range and is based on URL (10-50,000 and 1-10,000 domains respectively), but I guess that reflects their generic usefulness beyond learning resources. We know that in some cases LRMI is used for internal systems but not exposed on web pages, but still the level of usage is not as high as we would like.
I also often get asked about support for creating LRMI metadata. This time I’m including a mention of how it is possible to write WordPress plugins and themes with schema / LRMI support, and of the Drupal schema.org plugin. I’m also aware of “tagging tools” associated with various repositories, e.g. the learning registry and the Illinois Shared Learning Environment. I think it’s always going to be difficult to answer this one, as the best support will always come from customising whatever CMS an organisation uses to manage their content or metadata and will be tailored to their workflow and the types of resources and educational contexts they work in.
As far as implementation for search goes, I still cover Google custom search, as in the previous presentations.
Current LRMI activities
The DCMI LRMI task group is active, and one of our priorities is to improve the support for people who want to use LRMI. Two activities are nearing fruition: firstly, we are hoping to provide examples for relevant properties and types on the schema.org web site. Secondly, we want to provide better support for the vocabularies used for properties such as alignment type (in the Alignment Object), learning resource type etc, by way of clear definitions and machine readable vocabulary encodings (using SKOS). We are asking for public review and comment on LRMI vocabularies, so please take a look and get in touch.
Other work in progress is around schema for courses and extending some of the vocabularies mentioned above. We have monthly calls; if you would like to lend a hand please do get in touch.
Stefan Dietze invited me to give the keynote presentation at the pre-WWW2015 workshop Linked Learning 2015 in Florence. I’ve already posted a summary of a few of the other presentations I saw; this is a long account (from my speaker’s notes) of what I said. If you prefer you can read the abstract of my talk from the WWW2015 Companion Proceedings or look through my slides & notes on Google. This is a summary of past work at Cetis that led to our involvement with LRMI, why we got involved, and the current status of LRMI. There’s little here that I haven’t written about before, but I think this is the first time I’ve pulled it all together in this way.
Lorna M. Campbell was co-author for the presentation; the approach I take draws heavily on her presentation from the Cetis conference session on LRMI. Most of the work that we have done on LRMI has been through our affiliation with Cetis. I’ll describe LRMI and what it has achieved presently. In general for this I want to keep things fairly basic. I don’t want to assume a great deal of knowledge about the educational standards or the specifications on which LRMI is based, not so much because I think you don’t know anything, but because, firstly I want to show what LRMI drew on, but also because whenever I talk to people about LRMI it becomes clear that different people have different starting assumptions. I want to try to make sure that we kind of align our assumptions.
LRMI prehistory and precursors
I want to start by reviewing some of what we (Lorna and I and Cetis) did before LRMI and why we got involved in it.
That means talking about metadata. Mostly metadata for the purpose of resource discovery, in order to support the reuse of educational content; we want to support the reuse of educational content in order to justify greater effort going in to the creation of better content and allowing teachers to focus on designing better educational activities. We were never interested in metadata just for its own sake, but, we felt that however good an educational resource is, if you can’t find it you can’t use it.
And we can start with the LOM, the first international standard for educational technology, designed in the late 1990’s, completed in 2002 (at least the information model was–the XML binding came a couple of years later; other serializations such as RDF were never successfully completed)
We had nothing to do with designing the LOM.
But we did promote its use, for example:
I worked on a project called FAILTE, a resource discovery service for Engineering learning resources, which involved people with various expertise (librarians, engineering educators, learning technologists) creating what was essentially a catalogue of LOM records.
I was also involved in a wider initiative to facilitate similar services across all of UK HE, by creating an application profile for use by joint projects of two organisations, RDN & LTSN (RLLOMAP)
Meanwhile Lorna was leading work to create an application profile of the LOM with UK-wide applicability (UK-LOM core)
These were fairly typical examples of LOM implementation work at that time. Also, none of them still exists.
All these involve application profiles, that is tailoring the LOM by recommending a subset of elements to use and specifying what controlled vocabularies to use to provide values for them (see metadata principles and practicalities, section IIIA). And there’s a dilemma there, or at least you have to make a compromise, between creating descriptions which make sense in a local setting and meet local needs, and getting interoperability between different implementations of the LOM.
In fact some of the initial LRMI research was a survey of how the LOM is used. Looking at LOM records being exposed through OAI-PMH, it found that most LOM records provided very little beyond what could be provided with simple Dublin Core elements, which agreed with previous work comparing different application profiles (e.g. Goodby, 2004). (See also a similar study by Ochoa et al (2011), conducted at about the same time, which focussed on repositories that had been designed to use the LOM.)
But I wasn’t talking about the LOM in Florence. Why not? Well, IEEE LOM and IMS Metadata have their uses, and if they work for you that’s great. But I’ve also mentioned some of the problems that we faced when we tried to implement the LOM in more or less open systems: lots of effort to create each record, compromise between interoperability and addressing specific requirements. The structure of the LOM as a single XML tree-like metadata record comprising all the information about a resource does little to help you get around these problems. It also means that the standard as a whole is monolithic: the designers of the LOM had to solve the problems of how to describe IPR, technical, lifecycle issues, and others (then consider that many different resource types can be used as learning resources, and what works as a technical description of a text document might not work for an image or video). Solving how to describe educational properties is quite hard enough without throwing solutions to all of these others into the same standard.
So, having learnt a lot from the LOM, we moved on hoping to find approaches to learning resource description that disaggregated the problem (at both design and implementation stages) into smaller, less intimidating tasks.
I want to mention some work on semantic technologies and what was then beginning to be called linked data that Cetis helped commission and were involved in through a working group around 2008-2009: the Semantic Technologies in Learning and Teaching Jisc mini-project / Cetis working group, run by Thanassis Tiropanis et al out of the University of Southampton. The SemTech project aimed to raise the profile of semantic technologies in HE, and to highlight what problems they were good at solving. The project included a survey of then-existing semantic tools & services used for education to discover what they were being used for (they found 36, using a fairly loose definition of “semantic”).
The “five year plan” outlined by that project is worth reflecting on. Basically it suggested exposing more data that can be used by applications, thus encouraging more data to be released (a sort of optimistic virtuous cycle), followed by the development of shared ontologies, which yield benefits when you have large amounts of data. (Notably, it didn’t suggest spending years locked in a room coming up with the one ontology that would encompass everything before releasing any data.)
The development of semantic applications for teaching and learning for HE/FE over the next 5 years could be supported in a number of steps:
Encouraging the exposure of HE/FE repositories, VLEs, databases and existing Web 2.0 lightweight knowledge models in linked data formats. Enabling the development of learning and teaching applications that make use of linked data across HE/FE institutions; there is significant activity on linked open data to be considered
Enabling the deployment of semantic-based searching and matching services to enhance learning. Such applications could support group formation and learning resource recommendation based on linked data. The development of ontologies to which linked data will be matched is anticipated. The specification of patterns of semantic tools and services using linked data could be fostered
Collaborative ontology building and reasoning for pedagogical ends will be more valuable if deployed over a large volume of education related linked data where the value of searching and matching is sufficiently demonstrated. Pedagogy-aware applications making use of reasoning to establish learning context and to support argumentation and critical thinking over a large linked data field could be encouraged at this stage.
Our first efforts outside of IEEE LOM were in the Dublin Core Education Application Profile Task Group, between about 2006 and 2011, attempting to work on a shared ontology. Meanwhile others (notably Mikael Nilsson, KTH Royal Inst Technology, Stockholm) worked to get LOM data in RDF. This work kind of fizzled out, but we did get an idea of a domain model for learning resources, which rather usefully separated the educationally relevant properties from all the others. The cloud in the middle represents any resource-type specific domain model (say one for describing videos or one for describing textual resources) to which educationally relevant properties can be added. So this diagram represents what I was saying earlier about wanting to disaggregate the problem space so that we can focus on educational matters while other domain experts focus on their specialisms.
I want to mention in passing that around this time (2008/9) work started at ISO/IEC on semantic representation of metadata for learning resources. This was kicked off in response to the IEEE LOM being submitted for ratification as an ISO standard… and it is still ongoing. We’re not involved. Cetis has done no more than comment once or twice on this work.
In fact we did very little metadata work for a while. I thought I was done with it.
At this time there was an idea in educational technology circles that was encapsulated in the term #eduPunk: the idea that lightweight personal technology could be used to support teaching and learning, a sort of DIY approach to learning technology, without the constraints of large institutional, enterprise level systems (WordPress instead of the VLE, folksonomies instead of taxonomies).
In comparison to eduPunk, we were #eduProg. I’ve nothing against the virtuoso wizardry of ProgRock or a technically excellent OWL ontology, and I am not saying there is anything wrong in either. The point I am trying to make is that the interest and attention, the engagement from the Ed Tech community was not in EduProg.
The attention and engagement was in Open Educational Resources, and we supported a UK HE 3 year, £15 million programme around the release of HE resources under Creative Commons licences [UKOER]. Cetis provided strategic technical advice and support to the funder and to the 66 projects that released over 10,000 resources. The support included guidelines on technology approaches to the management, description and dissemination of OERs; the guidelines we gave were for lightweight dissemination technologies, minimal metadata, and putting resources where they could be found. We reflected at length on the technology approaches taken by this programme in our book Into the wild – Technology for open educational resources. We recognise the shortcomings in this approach: it’s not perfect, and some people were quite critical of it. If we had been able to point to any discovery services that were based on the LOM, or any more directed approach that was unarguably successful, we would have recommended it, but it seemed that Google and the open web were at least as successful as any other approach and required less effort on the part of the projects. Partly through UKOER we did see 10,000 resources and, more importantly, a change in culture so that using social sharing sites for education became unremarkable, and I would rather have that than a few hundred perfect metadata descriptions in a repository.
As far as resource description and resource discovery is concerned I think the most important advice we gave was this:
LRMI launched in 2011. What about it got us back into educational metadata? Let’s start from first principles, and look at the motivation behind LRMI, which is to help people find resources to meet their specific needs. I’ll try to illustrate this.
Meet Pam, a school teacher. Let’s say she wants to teach a lesson about the Declaration of Arbroath.
[See credits, below]
What are her specific needs? Well, they relate to her students: their age, their abilities; to her teaching scenario: is she looking for something to use as part of a half hour lesson on a wider topic, or something that will provide a plan of work over a few lessons? An introduction or revision? And there is the wider context: she’s unlikely to be teaching about the Declaration of Arbroath for its own sake; more likely it will relate to some aspect of a wider curriculum, perhaps history but perhaps also something around civic engagement in Scotland, or relations between Scotland and England, or precursors to the US Declaration of Independence, and she will be doing so because she is following some shared curriculum or exam syllabus.
She searches Google, finds lots of resources, many of them are no more than the text of the resource.
There are also tea towels and posters.
Those that go further do not necessarily do so in a way that is suitable for her pupils. There’s a Wikipedia article but that’s not really written with school children in mind. Google doesn’t really support narrowing down Pam’s search to match her requirements such as the age and educational level of students, the time required to use in a lesson, the relevance to requirements of national curriculum or exam syllabus, so Pam is forced to look at a series of separate search services based (often) on siloed metadata [examples 1, 2, 3]. It’s worth noting that the examples show categorisation by factors such as Key Stage (i.e. educational level in the English National Curriculum), educational subject, intended educational use (e.g. revision) and others, giving hints as to what Pam might use to filter her search. Google (historically) hasn’t been especially good at this sort of filtering, partly because it cannot always work out the relevance of the text in a document.
What happened to make us think that it was worth addressing this problem was schema.org:
a joint effort, in the spirit of sitemaps.org, to improve the web by creating a structured data markup schema supported by major search engines.
An agreed vocabulary for naming the characteristics of resources and the relationships between them.
Which can be added to HTML (as microdata, RDFa or JSON-LD) to help computers understand what the strings of text mean.
Adding schema.org markup (as microdata) to HTML turns the code behind a web page from something like:
<h1>Learning Resource Metadata Initiative: using schema.org to describe open educational resources</h1>
<p>by Phil Barker, Cetis, School of Mathematical and Computer Sciences, Heriot-Watt University <br />
Lorna M Campbell, Cetis, Institute for Educational Cybernetics, University of Bolton. April 2014</p>
i.e. just strings, with not much to hint as to which string is the author’s name, which string is the title of the paper, and which string is the author’s affiliation; to something like:
<div itemscope itemtype="http://schema.org/ScholarlyArticle">
<h1 itemprop="name">Learning Resource Metadata Initiative: using schema.org to describe open educational resources</h1>
<p itemprop="author" itemscope itemtype="http://schema.org/Person">
<span itemprop="name">Phil Barker</span>,
<span itemprop="affiliation">Cetis, School of Mathematical and Computer Sciences, Heriot-Watt University</span></p>
<p itemprop="author" itemscope itemtype="http://schema.org/Person">
<span itemprop="name">Lorna M Campbell</span>,
<span itemprop="affiliation">Cetis, Institute for Educational Cybernetics, University of Bolton</span></p>
</div>
where the main entities and their relationships are marked and text that relates to properties of those items is identified: a Scholarly Article is related to two Persons who are the authors; some of the text is the name of the Scholarly Article (i.e. its title), the names of the Persons and their affiliations. Represented graphically, we could show this information as:
An entity – relation graph identifying the types of entities, their relationships to each other and to the strings that describe significant properties.
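To make concrete how software can recover that graph from the marked-up page, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a conforming microdata processor: the class name `MicrodataSketch` and its fields are hypothetical, the title and affiliations in the sample are shortened, the HTML is assumed to be well formed with properly nested tags, and features such as `itemref`, `itemid` and URL-valued properties are ignored.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they get no stack frame.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

# Shortened version of the marked-up example above.
SAMPLE = """
<div itemscope itemtype="http://schema.org/ScholarlyArticle">
  <h1 itemprop="name">Learning Resource Metadata Initiative</h1>
  <p itemprop="author" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Phil Barker</span>,
    <span itemprop="affiliation">Cetis, Heriot-Watt University</span></p>
  <p itemprop="author" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Lorna M Campbell</span>,
    <span itemprop="affiliation">Cetis, University of Bolton</span></p>
</div>
"""


class MicrodataSketch(HTMLParser):
    """Hypothetical helper: collects microdata items as nested dicts."""

    def __init__(self):
        super().__init__()
        self.items = []       # top-level items found in the page
        self.open_items = []  # items whose element is still open
        self.frames = []      # one frame per open (non-void) element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        frame = {"prop": attrs.get("itemprop"), "item": None, "text": []}
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            frame["item"] = item
            if frame["prop"] and self.open_items:
                # A nested item is the value of a property of its parent.
                parent = self.open_items[-1]["properties"]
                parent.setdefault(frame["prop"], []).append(item)
            else:
                self.items.append(item)
            self.open_items.append(item)
        self.frames.append(frame)

    def handle_data(self, data):
        # Text counts towards every open property that has a string value.
        for frame in self.frames:
            if frame["prop"] and frame["item"] is None:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.frames:
            return
        frame = self.frames.pop()
        if frame["item"] is not None:
            self.open_items.pop()
        elif frame["prop"] and self.open_items:
            # Collapse whitespace and store the string as the property value.
            value = " ".join("".join(frame["text"]).split())
            props = self.open_items[-1]["properties"]
            props.setdefault(frame["prop"], []).append(value)


parser = MicrodataSketch()
parser.feed(SAMPLE)
article = parser.items[0]
print(article["type"])
for person in article["properties"]["author"]:
    props = person["properties"]
    print(props["name"][0], "|", props["affiliation"][0])
```

The point of the sketch is that nothing here relies on guessing from the text: the article, its two authors and their affiliations come straight from the `itemscope`, `itemtype` and `itemprop` attributes.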
At this point the LRMI was initiated, a 3 year project funded by the Bill and Melinda Gates Foundation (and later with some additional funding from the Hewlett Foundation), managed jointly by one organisation committed to open education (Creative Commons) and another (AEP) from the commercial publishing world, with input from education, publishers and metadata experts.
I was on the technical working group. We issued a call for participation; gathered use cases; and did the usual meeting and discussing to hammer out how to meet those use cases. We worked more or less in the open: there was an invitation-only face to face meeting near the beginning (limited funding, so we couldn’t invite everyone); after that all the work was on open email discussion lists and conference calls. Basing the work on schema.org allowed us to leave all the generic and resource-format specific stuff for other people to handle, and we could focus just on the educational properties that we needed.
The slide on the left shows what came out. The first two properties are major relationships to other entities, namely alignment to some educational framework and the intended audience; the others are mostly simple characteristics. All are defined in the LRMI specification. In a previous blog post I have attempted further explanation of the Alignment Object. Most of these were added to schema.org in 2013; the link to licence information was added later.
Current state of LRMI and future plans.
LRMI has been implemented by a number of organisations, some with project funding to encourage uptake, others more organically. One of the nice things about piggy-backing on schema.org is that people who have never heard of LRMI are using it.
Not every organisation on this list exposes LRMI metadata in its webpages, some harvest it or create it and use it internally. The Learning Registry is especially interesting as it is a data store of information about learning resources in many different schema, which uses LRMI as JSON-LD for (many of its) descriptive records. We have reported in some depth on the various ways in which LRMI has been implemented by those projects who are funded through the initiative.
We can create a Google custom search engine that looks for the alignment object (this in itself is a good indicator that someone has considered the resource to be useful for learning), and we can add filters to find learning resources useful for specific contexts, in this case different educational levels. This helps Pam narrow down her search, at least in a proof of concept way; as they stand these are not intended to be useful services.
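The filtering idea can be sketched outside of any search engine. The following Python is an illustration only, and says nothing about how Google custom search works internally: the records are invented, the helper name `has_alignment` is hypothetical, and I am assuming `educationLevel` and `educationalSubject` as alignment type values. The property names (`educationalAlignment`, `alignmentType`, `targetName`) are those of the LRMI Alignment Object.

```python
# Invented example records, shaped like LRMI / schema.org descriptions.
RESOURCES = [
    {
        "name": "The Declaration of Arbroath: a lesson plan",
        "educationalAlignment": [
            {"alignmentType": "educationLevel", "targetName": "Key Stage 3"},
            {"alignmentType": "educationalSubject", "targetName": "History"},
        ],
    },
    {
        "name": "Wars of Independence revision notes",
        "educationalAlignment": [
            {"alignmentType": "educationLevel", "targetName": "Key Stage 4"},
        ],
    },
    {"name": "Declaration of Arbroath tea towel"},  # no alignment at all
]


def has_alignment(resource, alignment_type=None, target_name=None):
    """True if the resource has an alignment matching the given filters."""
    for alignment in resource.get("educationalAlignment", []):
        if alignment_type and alignment.get("alignmentType") != alignment_type:
            continue
        if target_name and alignment.get("targetName") != target_name:
            continue
        return True
    return False


# Step 1: anything with an alignment object is probably a learning resource.
learning_resources = [r for r in RESOURCES if has_alignment(r)]

# Step 2: narrow down to one educational level.
key_stage_3 = [
    r for r in learning_resources
    if has_alignment(r, "educationLevel", "Key Stage 3")
]

print([r["name"] for r in key_stage_3])
```

Note that the second step matches `targetName` as an exact string, so it only works when publishers use the same terms for the same levels.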
I would like to note the following points from these implementations:
they exist. That’s a good first step.
not every implementation exposes LRMI metadata, some use it internally.
there is no agreement on value spaces, either terms or meanings (e.g. educational level, 1st Grade, Primary 1).
The Gates funding for LRMI is now complete, and as an organization LRMI is now a task group of the Dublin Core Metadata Initiative. That provides us with the mechanisms and governance required to maintain, promote, and if necessary extend the specification. It does not mean that LRMI terms are DC terms: they’re not, they’re in a different namespace. DCMI is more than a set of RDF terms, it’s a community of experts working together, and that’s what LRMI is part of. The LRMI specification is now a community specification of DCMI, conforming to the requirements of DCMI, such as having well-maintained definitions in RDF, which align with the schema.org definitions but are independently extensible.
The planned work of the task group is shown on the group wiki, and includes:
Extending LRMI: Events? Courses?
contributing via new schema.org extension mechanism?
Recommended value vocabularies
Linked data representation of educational frameworks (alignment)
(There’s also a background interest in the use of LRMI beyond the original schema.org scenario, for example as stand-alone JSON-LD or as EPUB metadata for eTextBooks)
It’s customary to allow time for the audience to ask difficult questions of the presenter. I tried to forestall that by asking the audience’s opinion on these questions:
Does this help with the endeavour to expose lightweight linked data?
(can you get the data out of web pages?)
How do we encourage linked data representation of educational frameworks?
How much goes into schema.org (or similar) or should we just reuse existing ontologies?
Can you cope with the quality of data that can be provided at web-scale?
Reflections on the presentation
As far as I could judge from the questions that I couldn’t answer well, the weak points in the presentation, or maybe in LRMI, seem to be around gauging the level of uptake: how many pages are there out there with LRMI data on them? I don’t know. The schema.org pages for each entity show usage, for example the Alignment Object is on between 10 and 100 domains, but I do not know the size of those domains. That also misses those services that use LRMI and do not expose it in their webpages but would expose it as linked data in some other format. I suspect uptake is less than I would like, and I would like to see more.
As presenter I was happy that even after I had talked about all that for about 45 minutes, there were people who wanted to ask me questions (the forestalling tactic didn’t work), and even after that there were people who wanted to talk to me instead of going for coffee. That seems to be a good indicator that there was interest from the workshop’s audience.
Image credits: Photo of Pam Robertson, teacher, by Vgrigas (Own work) [CC-BY-SA-3.0], via Wikimedia Commons; reproduction of Tyninghame (1320 A.D) copy of the Declaration of Arbroath, 1320, via Wikimedia Commons. Logos (Heriot-Watt, Cetis, LRMI, Semtech etc.) are property of the respective organisation. Unless noted otherwise on slide image, other images created by the authors and licensed as CC-BY.
When the Learning Resource Metadata Initiative (LRMI) technical working group started its work it focused on identifying the properties and relationships that were important for educational resources but could not be adequately expressed using schema.org as it then stood. One of those important pieces of information was the licence under which a resource was released, and so the LRMI spec from the start had the property useRightsUrl “The URL where the owner specifies permissions for using the resource.” When schema.org adopted most of the LRMI properties, useRightsUrl was an exception, it was not adopted pending further consideration–not surprising really given the wide-ranging applicability of licence information beyond learning resources.
Back in June the good news came that version 1.6 of schema.org included a license property for Creative Works that does all that LRMI wanted, and more.
What does this mean for LRMI adopters?
Some adopters of LRMI have already started using useRightsUrl. Such implementations are valid LRMI but not valid schema.org, which means that they will only be understood by applications that have been written specifically to understand LRMI and not by the general purpose web-scale search applications. This is sub-optimal.
In passing, let me mention another complication. With schema.org you have a choice of syntax: microdata and RDFa 1.1 lite. With RDFa there was already a mechanism for identifying a link to a licence, that is rel=”license”. Just to complicate a little more, RDFa allows name spacing, and the term license appears in at least three widely used namespaces: HTML5, Dublin Core Terms, and the Creative Commons Rights Expression Language–hopefully this will never matter to you. To exemplify one of these options I’ll use the HTML that you get when you use the Creative Commons License Chooser (but let’s be absolutely clear, what I am writing about applies to any type of license whether the terms be open or commercial):
The good news is that all these options play nicely together, you can have the best of all worlds.
If you are already using itemprop=”useRightsUrl” to identify the link to a licence using LRMI in microdata, you can also use the license property and rel=”license”. The following LRMI microdata with a bit of RDFa thrown in works:
If you are using LRMI / schema.org in RDFa, then the following is valid:
<html>
<body vocab="http://schema.org/" typeof="CreativeWork">
<a rel="license useRightsUrl"
href="https://creativecommons.org/licenses/by/4.0/">
Creative Commons Attribution 4.0 International (CC BY 4.0) licence
</a>
</body>
</html>
License does what LRMI asked for and more
In my opinion the schema.org license property is superior to the LRMI useRightsUrl for a few reasons. It does everything that LRMI wanted by way of identifying the URL of the licence under which the creative work is released, but also:
It belongs to a more widely recognised namespace, especially important if you are wanting to generate RDF data
I prefer the semantics of the name and definition: a license can include restrictions of use as well as grant rights and permissions.
The range, i.e. the type of value that can be provided, includes Creative Works as well as URLs
That last point allows one to encode the name, url, description, date, accountable person and a whole host of other information about the licence (albeit at the cost of not being able to do so alongside LRMI’s useRightsUrl quite so simply)
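As a sketch of that last point, here is one way the fuller description might look when written as JSON-LD, generated with Python’s standard json module. The resource name, the wording of the description and the choice of which licence properties to include are my assumptions for illustration; name, url and description are ordinary schema.org CreativeWork properties.

```python
import json

# Illustrative description: the resource itself is invented for the example.
resource = {
    "@context": "http://schema.org/",
    "@type": "CreativeWork",
    "name": "An example learning resource",
    # The license value is itself a CreativeWork rather than a bare URL,
    # so it can carry its own name, url and description.
    "license": {
        "@type": "CreativeWork",
        "name": "Creative Commons Attribution 4.0 International (CC BY 4.0)",
        "url": "https://creativecommons.org/licenses/by/4.0/",
        "description": "Reuse permitted provided attribution is given.",
    },
}

print(json.dumps(resource, indent=2))
```

Where only the link matters, the value of license can simply be the URL string instead of the nested object.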
Summary
The inclusion in schema.org of the license property is good news for the aims of LRMI. If you use LRMI and care about licensing you should tag the information you provide about the license with it. If you already use LRMI’s useRightsUrl or RDFa’s rel=”license” there is no need to stop doing so.
One question that we always get asked about LRMI is “who is using it?” There are two sides to this: use by search service providers and use by resource providers. This post touches on the latter.
In phase 2 of the LRMI project, various organizations were given small amounts of money to implement LRMI in their systems and workflows. Those organizations are listed on the Creative Commons web site, and Lorna is in the process of gathering together the lessons they learnt, which will be reported back shortly. Perhaps more important, at least from the point of view of sustainability, are implementations that arise spontaneously, either from organizations with learning resources to disseminate who make a conscious decision to use LRMI, or from those who in using schema.org markup find that one of the properties that LRMI added is appropriate. Of course no one doing this is under any obligation to inform us of what they are doing, so it is harder to keep track of such use. Fortunately the Google Custom Search Engine Wilbert and I cobbled together can be used to discover such implementations. It’s a bit hit-and-miss: you need to search for common topics (Math, English) and trawl through the results for new sites, but it’s better than nothing.
Listed below (in alphabetical order) are the sites we’ve found, a link to a sample page with embedded schema.org / LRMI, a link to the Google Structured Data testing tool results for that page, and sometimes a note or two.
BBC Knowledge and Learning (Beta) BBC education resources example page, testing tool result
Comment: uses typical age range and alignments to education level and subject. Alignment object name and Url properties used when targetName and targetUrl should have been.
Brokers of expertise State of California resources for educators example page, testing tool result
Comment: uses several properties, but be wary of using alignment object name and url for target and of intended end user role as property of creative work not audience.
CTE Online career technical education. example page, testing tool result
Comment: several properties used, but be wary of using alignment object name and url for target and of intended end user role as property of creative work not audience.
Of course there are others using LRMI properties in their webpages that I happened not to find (t.b.h. I didn’t spend very long looking) and more who are using them to support internal business processes that Google never sees. If you know of an interesting use of LRMI from which others might learn, post a comment below (if comments are closed, contact me).
On 17th-18th June, in Bolton, Cetis had their more-or-less annual conference. One of the sessions was Lorna and me, with some help from our friends, discussing LRMI addressing the question “What on Earth Could Justify Another Attempt at Educational Metadata?”
Lorna started with an overview of our involvement in educational metadata, from EEVL and FAILTE, through IMS Learning Resource Meta-data, RLLOMAP and the UK LOMCore to Dublin Core Education, up to ISO-MLR. She then described the origin of LRMI.
So, yes, we really love metadata, but reached a point where making ever-more elegantly complex iterations on the same idea kind of lost its appeal. So what is it that makes LRMI so different, so appealing? I gave a technical overview (basically a summary of the recent Cetis paper What is schema.org? and my blog post on explaining the LRMI alignment object).
So, the difference is that LRMI/schema.org metadata is deeply embedded in the web to the extent that it is right in the pointy brackets of the HTML code of web pages, marking up what humans can see, which crucially is where Google and other search engines want it. (That is not to say that it cannot be useful elsewhere.) Which is great, but what about implementation: at what stage can we show that some tangible benefits are on the way? That needs webpages that carry LRMI mark-up and a means of searching for them. I presented a summary of the work that Wilbert and I did on building a Google Custom Search Engine and filtering Google custom searches on LRMI properties; it also pointed to some work in schema-labs on a custom search for education.
Those are first steps, proofs of concept. There seemed to be agreement that they showed potential, but obviously there needs to be more coherence if they are to work as a useful discovery service in real life. What about the other first step that needs to be taken, getting those who disseminate resources to use LRMI in their resource description pages? Well, first Lorna presented on her discussions with organizations who received a little funding to implement LRMI as part of phase 2 of the initiative.
Then Ben Ryan of Jorum discussed his work in implementing schema.org / LRMI in DSpace. The integration of LRMI into repository platforms and content management systems is key to getting it used widely across the web. I’m pleased to report back that Ben didn’t report problems with the spec itself, though as always there are questions around workflow and metadata quality.
Finally I gave a short overview of some of the sites that we have found to be using LRMI because they show up in the Custom Search Engine results.
The general feeling I had from the session was that most of the people involved thought that LRMI was a sane approach: useful, realistic and manageable. One of my favourite comments during that presentation was from Jenny Gray who tweeted
Have apparently been doing #lrmi in openlearn since before it was a thing. Cant work out how!
and commented that she wasn’t sure whether what was in the OpenLearn pages was LRMI. Well, it is schema.org with properties that came from phase one of the initiative (it seems someone has been extending the work Jenny did), embedded as RDFa, which was an approach for structuring data in webpages that predated schema.org. And I think it is really promising for adoption of LRMI that you don’t need any specialist knowledge of educational metadata standards in order to find yourself using it. With this widespread, almost accidental adoption comes a challenge: this work isn’t happening in the highly controlled world of information experts (librarians, or semantic experts used to working with descriptive ontologies); it’s happening web-wide, where web developers / web masters will take liberties in order to say what they want to say. Martin Hepp describes the significant changes of approach involved in this move from ontologies to web ontologies in a video presentation that I cannot recommend highly enough. Thinking about the minimalism of simple Dublin Core and the EduProg of LOM and ISO MLR, I see this as the challenge of having freedom of expression but keeping coherence. Is this the shape of metadata to come?
I had the pleasure yesterday to talk on the Mozilla Open Badges community call about how LRMI and Open Badges may intersect. Open Badges are a means of displaying digital recognition of skills and achievements; there’s a technical framework behind the badges that offers the means of providing data in support of the claimed achievement. A particular part of this technical framework is the assertion specification, which includes a pointer from each badge to “the educational standards this badge aligns to, if any”. This parallels the LRMI alignment object very closely: in short the educationalAlignment property that LRMI added to schema.org allows encoding of statements along the lines of “this resource [teaches|assesses|requires|has level] X” where X is some point in a shared educational framework, e.g. of attainment standards, topics or educational levels or shared curriculum. Diagrammatically
The creative work aligns with a node in an educational framework. The alignment object identifies that node and the nature of the alignment.
The Mozilla badge alignment object is described thus:
Property | Expected Type | Description
------------|---------------|------------
name | Text | Name of the alignment.
url | URL | URL linking to the official description of the standard.
description | Text | Short description of the standard
and an example is provided
{
  "name": "Awesome Robotics Badge",
  ...
  "alignment": [
    {
      "name": "CCSS.ELA-Literacy.RST.11-12.3",
      "url": "http://www.corestandards.org/ELA-Literacy/RST/11-12/3",
      "description": "Follow precisely a complex multistep procedure when carrying out experiments, taking measurements, or performing technical tasks; analyze the specific results based on explanations in the text."
    }
  ]
  ...
}
Diagrammatically:
The badge information includes an assertion that the skill or achievement aligns with some point in an educational standard
Not only do the LRMI and Open Badge alignment objects both do the same thing, they seem to have the following semantically equivalent properties for identifying the thing that is aligned to: name and targetName, url and targetUrl, description and targetDescription.
(I like to think that this is not coincidence, but I don’t know how the similarity arose.)
The differences:
Open Badges do not identify the type of alignment. It has no need, I guess, since the alignment is always one of “asserts ability at” or something similar. LRMI currently recommends no relevant value.
Open Badges do not name the framework; I guess they assume that identifying the node will lead to knowledge of the framework. LRMI felt that this would not always be enough.
The LRMI alignment object is used in conjunction with a property of schema.org/CreativeWork; I don’t think Mozilla open badge assertions are creative works in that sense, I think they are some type of schema.org/Intangible.
Syntactically, OpenBadge assertions are made using JSON, I don’t think they use microdata. Through schema.org, LRMI uses microdata and JSON-LD.
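To make the overlap concrete, here is a minimal Python sketch of how one entry from a badge's alignment list might be re-expressed as an LRMI alignment object. The function name and its optional arguments are my own invention for illustration, not part of either spec; the framework name and alignment type have to be supplied by the caller because the badge data does not carry them.

```python
# Open Badges alignment property -> LRMI AlignmentObject property.
BADGE_TO_LRMI = {
    "name": "targetName",
    "url": "targetUrl",
    "description": "targetDescription",
}


def badge_alignment_to_lrmi(badge_alignment, framework=None, alignment_type=None):
    """Re-express one Open Badges alignment entry as an AlignmentObject dict.

    Hypothetical helper: framework and alignment_type are optional extras
    that the badge alignment itself has no property for.
    """
    lrmi = {"@type": "AlignmentObject"}
    for badge_key, lrmi_key in BADGE_TO_LRMI.items():
        if badge_key in badge_alignment:
            lrmi[lrmi_key] = badge_alignment[badge_key]
    if framework is not None:
        lrmi["educationalFramework"] = framework
    if alignment_type is not None:
        lrmi["alignmentType"] = alignment_type
    return lrmi


badge_alignment = {
    "name": "CCSS.ELA-Literacy.RST.11-12.3",
    "url": "http://www.corestandards.org/ELA-Literacy/RST/11-12/3",
    "description": "Follow precisely a complex multistep procedure when "
    "carrying out experiments, taking measurements, or performing technical "
    "tasks; analyze the specific results based on explanations in the text.",
}

converted = badge_alignment_to_lrmi(
    badge_alignment, framework="Common Core State Standards"
)
print(converted)
```

The conversion only renames three properties, which is rather the point: the two objects identify the aligned-to node in the same way.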
aligning the alignment objects
The discussion that I hope to kick off with the Mozilla Open Badge and LRMI communities is should/could we make the similarities between the two alignment objects more explicit? This would give developers a two-for-one offer, understand the way Open Badges expresses alignment and you’ve understood what LRMI does, and vice versa. I don’t suppose either group wants to change a spec that is in productive use, but an informative statement about the similarities could be provided without changing either.
Beyond that I wonder if the Open Badge community have thought about use of schema.org when advertising badges, i.e. if you provide a webpage saying “we offer the following badges for X, Y and Z” would there be benefit in marking this up with schema.org microdata to improve discoverability by search engines? If there is benefit in doing so, then it would be worth thinking about what type of schema.org Thing badges are and how the LRMI alignment object might be attached to it.
The bigger picture is that someone working with the starting point of wanting to learn about something could find resources to help them learn it with the help of LRMI alignments and discover the means of showing that they had learnt it via Open Badge alignments.
The educational alignment property and the associated alignment object that LRMI introduced into schema.org have been described as the “killer feature” for LRMI. However, I know from the number of questions asked about the alignment object and from examples I have seen of it being used wrongly that it is not the easiest construct to understand.
Perhaps the problems come from the nature of the alignment object as a conceptual abstraction, so maybe it will help to show some concrete examples of how it may be used. However, bear in mind that the abstraction was a deliberate design decision made so that the alignment object should be more widely applicable than the examples given here. So I will first discuss a little about why some simpler more direct approaches were considered and rejected (as were some approaches that would be even more abstract).
basic use case
The general use case that the alignment object was introduced to meet was, in brief,
“help people find resources that can be useful in teaching or learning in some specific scenario.”
That looks deceptively simple. The complications come when defining the “specific scenario” and unpacking the word “useful”.
enter “educational frameworks”
One practical approach to defining various aspects of the specific scenario involves reference to an educational framework of some sort. By educational framework I mean a structured description of educational concepts such as a shared curriculum, syllabus or set of learning objectives, or a vocabulary for describing some other aspect of education such as educational levels or reading ability.
“Educational framework” is a deliberately broad concept as we wanted LRMI to be applicable globally and across many levels and modes of education. Some specific examples are school-level curricula or attainment standards such as:
Perhaps more relevant to higher education many professional bodies define the competencies required to become a member of their profession, for example:
As well as having a role in defining competences and outcomes, measures of academic level or difficulty may be useful independently as reference points, for example:
the US K12 grade levels are well understood in terms of school level,
various empirical measures of reading difficulty, for example the general idea of “reading age” and the specific measures of reading ability and text level used by lexile.
On the other hand you may just want to specify the subject being taught, or the educational discipline for which it is being taught. Various classification schemes for academic subjects are available, for example:
All of these frameworks (and many others) may be used to describe aspects of an educational scenario.
ways of being useful
Life isn’t simple enough for us to meet the use case described above by adding a single property to schema.org Creative Works to say that the resource “aligns with” (i.e. is useful in the context defined by) some entry or node in an educational framework. In prescribing a “useful” resource we would want to distinguish between resources that teach and assess a topic; we also want a resource that assumes suitable previous knowledge, or requires some specific reading level, or assumes a certain general academic level. There may be other forms of alignment. There isn’t agreement on a minimum core set of properties required to address that word “useful” in the use case, but there is agreement that a resource can “align” with an “educational framework” in several ways, some of which we can enumerate. Hence the birth of the alignment property and abstract Educational Alignment object.
the abstraction
I think of it like this:
We start with a Creative work:
and an educational framework: (Note, there is no schema.org class of type EducationalFramework, but we assume that we can refer to some of the following properties pertaining to it: some text that identifies the framework as whole (let’s call it a name), and the URLs, names and/or descriptions of nodes within the framework.)
The alignment object was created to describe the relationship between the two. The following properties of alignment objects are defined: educationalFramework, which can be used to hold text that identifies the educational framework you are pointing to; targetDescription, targetName and targetUrl, which can hold the values that correspond to properties we assumed that nodes in the educational framework would have. It also has an alignmentType property that I think of as switching the object to specify the different types of alignment that are possible. So we can put them together to express an alignment between a creative work and some node in an educational framework:
common mistakes
I have seen both of these mistakes in actual markup of webpages.
1. The alignment object on its own is fairly meaningless. Unless it is referenced by the educationalAlignment property of a creative work it’s as useful as half a link.
2. Since the alignment object is a proper schema.org Thing (to be specific a subtype of an Intangible Thing) it inherits the properties that every schema.org Thing has, e.g. a name, a url, a description, an image. Some of these make some sense in some cases (see below) but importantly, none of them are used in expressing the alignment: the url of an alignment object is not the same as the url of the creative work or the node to which it aligns.
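Both mistakes are mechanical enough that a simple check can flag them. The sketch below is only an illustration: it assumes the items marked up on a page have already been extracted into plain Python dicts with an "@type" key (the extraction itself is not shown), and the function names and warning texts are my own.

```python
# Properties every schema.org Thing has, paired with the AlignmentObject
# property that was probably intended for describing the framework node.
LIKELY_INTENDED = {
    "name": "targetName",
    "url": "targetUrl",
    "description": "targetDescription",
}


def check_alignment(alignment):
    """Warn when an inherited property is used and its target* twin is absent."""
    warnings = []
    for inherited, target in LIKELY_INTENDED.items():
        if inherited in alignment and target not in alignment:
            warnings.append(f"'{inherited}' set without '{target}'")
    return warnings


def check_page(top_level_items):
    """Check the top-level items of a page for the two common mistakes."""
    warnings = []
    for item in top_level_items:
        if item.get("@type") == "AlignmentObject":
            # Mistake 1: an alignment that no creative work points to.
            warnings.append("AlignmentObject not referenced by any educationalAlignment")
            warnings.extend(check_alignment(item))
            continue
        value = item.get("educationalAlignment", [])
        alignments = value if isinstance(value, list) else [value]
        for alignment in alignments:
            # Mistake 2: name/url/description used to describe the target node.
            warnings.extend(check_alignment(alignment))
    return warnings


page = [
    {"@type": "AlignmentObject", "targetName": "AERO317"},
    {
        "@type": "CreativeWork",
        "name": "A lecture",
        "educationalAlignment": {
            "@type": "AlignmentObject",
            "name": "AERO317",
            "url": "http://example.org/module",
        },
    },
]

for warning in check_page(page):
    print(warning)
```

Note that the second check is only a heuristic: an alignment object may legitimately have its own name or url, so a warning means "look again", not "this is wrong".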
real-world examples of alignment assertions
I would like to use two real-world examples of where services provide information that can be seen as an assertion that a resource is useful in connection with (i.e. aligns with) an educational framework:
1. Kritikos, where students can tell other students what is useful for their course.
Screen shot of Kritikos information page about an MIT OCW lecture video. See it in kritikos.
Kritikos is a custom search engine for visual media relevant to teaching and learning engineering. In part the customisation comes through the use of a Google CSE, but more relevant to this post is the part that comes through allowing users to classify whether resources found on it are useful for specific courses [aside: this part of the kritikos service is built on a Learning Registry node].
The example shown here is the kritikos information page for a video of a lecture from MIT Open CourseWare. It includes “what others are saying about this resource” with the information from a year 3 MEng Aerospace Engineering student that it is relevant to “Flight Dynamics and Control”. The link from this assertion leads to other resources deemed useful by users for that module. “Flight Dynamics and Control” is a module at the University of Liverpool (code AERO317) that exists within the framework of Liverpool’s Aerospace Engineering programme. It is worth noting that kritikos can also be used to record when a resource is not relevant to a course; this is useful for weeding out false positives that get through the Google custom search engine. [Disclosure/bragging: I had an advisory role in the project that led to kritikos.]
So, there’s an expression of an educational alignment; how does it relate to the alignment object?
The creative work in question is the MIT lecture (to be precise it’s a http://schema.org/VideoObject), we could describe a few of its characteristics with schema.org properties:
name = “Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005”
url=http://www.youtube.com/watch?v=2QRfkG7jOfY
duration = PT110M22S
I’m not guessing this, the YouTube page has Schema.org microdata in it.
The node in the educational framework is a bit less well defined, but we would be justified in calling the module description a node in a framework called “University of Liverpool Modules” and saying the name for this node is “AERO317”, its description is “Flight Dynamics and Control”. It has a page on the web which gives us a url, http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm. So we can express the alignment:
item type = http://schema.org/VideoObject
    name = "Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005"
    url = http://www.youtube.com/watch?v=2QRfkG7jOfY
    duration = PT110M22S
    educationalAlignment = item1
item1 type = http://schema.org/AlignmentObject
    alignmentType = "Teaches"
    educationalFramework = "University of Liverpool Modules"
    targetName = "AERO317"
    targetDescription = "Flight Dynamics and Control"
    targetUrl = http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm
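The notation above is informal. As one possible concrete rendering, the Python below writes the same statement as a JSON-LD structure, with the alignment object nested as the value of educationalAlignment rather than referred to as item1. JSON-LD is my choice of syntax here, not something either kritikos or YouTube is known to publish, and I have used the lower-case alignment type 'teaches'.

```python
import json

video = {
    "@context": "https://schema.org",
    "@type": "VideoObject",
    "name": "Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005",
    "url": "http://www.youtube.com/watch?v=2QRfkG7jOfY",
    "duration": "PT110M22S",  # ISO 8601 duration: 110 minutes 22 seconds
    # The alignment object is nested inside the creative work it belongs to.
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "teaches",
        "educationalFramework": "University of Liverpool Modules",
        "targetName": "AERO317",
        "targetDescription": "Flight Dynamics and Control",
        "targetUrl": "http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm",
    },
}

print(json.dumps(video, indent=2))
```

Nesting makes the first common mistake hard to commit: the alignment cannot float free of the video it describes.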
What about the other properties of the AlignmentObject, the ones it inherited by virtue of being an official Intangible Thing in the schema.org hierarchy? Well you could envisage the image property pointing to the screenshot above, and the url property being a url with a fragment identifier that points to the “what others are saying” part of the kritikos page. Sure, you can give it a name and description if you want to. Maybe these aren’t especially useful, but the point is that they are clearly different from the url, name and description of the University of Liverpool course to which the MITOCW video aligns.
2. OER Commons, aligning to US Common Core State Standards
I’ll cover this in less detail. The main problem with the example above is that the educational framework, while locally useful, is somewhat ad hoc: we had to kind of look at the course structure at Liverpool University in a certain way to see it as an educational framework. Better examples of more widely shared and more formally constructed educational frameworks are those of the US Common Core State Standards Initiative. OER Commons is a repository and search engine for Open Educational Resources that expresses alignment to these frameworks in its descriptions.
The screenshot on the left shows such an alignment being displayed (the image links to the actual page in question, which is more legible). You see that in this case the creative work called “Chocolate Chocolate Chocolate” aligns with the Common Core Standard “CCSS.ELA-Literacy.RL.1.9 : Compare and contrast the adventures and experiences of characters in stories.”
Interestingly there is some other information given about the “degree of alignment”, i.e. how good a match that resource is to teaching that State Standard.
justification for the abstraction of the alignment object
In part the motivation for creating an alignment object class in schema.org was the issue mentioned above about not knowing what might be all the possible forms of alignment between a resource and an educational framework used to characterise some aspect of a teaching and learning scenario. However I hope the examples above go some way to showing that alignments are real (if intangible) things: you can give them URLs, and names if you want. Furthermore they do have properties. For example, they are asserted by someone: a student at Liverpool University in the kritikos example and a user of OER Commons in the other. In the OER Commons example there is other information about the degree of alignment. This goes some way to convincing me that the alignment object isn’t just some computer science trick of indirection.
A while back I summarised the input about semantics and academic coding that Lorna and I had made on behalf of Cetis for a study on possible reforms to JACS, the Joint Academic Coding System. That study has now been published.
JACS is maintained by HESA (the Higher Education Statistics Agency) and UCAS (Universities and Colleges Admissions Service) as a means of classifying UK University courses by subject; it is also used by a number of other organisations for classification of other resources, for example teaching and learning resources. The report (with appendices) considers the varying requirements and uses of subject coding in HE and sets out options for the development of a replacement for JACS.
Of course, this is all only of glancing interest, until you realise that stuff like Unistats and the Key Information Set (KIS) are powered by JACS.
– See more at Followers of the apocalypse
If you’re not sure why this should interest you (and yet for some reason have read this far) David Kernohan has written what I can only describe as an appreciation of the report, Hit the road JACS, from which the quote above is taken.
The Learning Resource Metadata Initiative aimed to help people discover useful learning resources by adding to the schema.org ontology properties to describe educational characteristics of creative works. Well, as of the release of schema draft version 1.0a a couple of weeks ago, the LRMI properties are in the official schema.org ontology.
Schema.org represents two things: 1, an ontology for describing resources on the web, with a hierarchical set of resource types each with defined properties that relate to their characteristics and relationships with other things in the schema hierarchy; and 2, a syntax for embedding these into HTML pages–well, two syntaxes, microdata and RDFa lite. The important factor in schema.org is that it is backed by Google, Yahoo, Bing and Yandex, which should be useful for resource discovery. The inclusion of the LRMI properties means that you can now use schema.org to mark up your descriptions of the following characteristics of a creative work:
audience the educational audience for whom the resource was created, who might have educational roles such as teacher, learner, parent.
educational alignment an alignment to an established educational framework, for example a curriculum or frameworks of educational levels or competencies. Expressed through an abstract thing called an Alignment Object which allows a link to and description of the node in the framework to which the resource aligns, and specifies the nature of the alignment, which might be that the resource ‘assesses’, ‘teaches’ or ‘requires’ the knowledge/skills/competency to which the resource aligns or that it has the ‘textComplexity’, ‘readingLevel’, ‘educationalSubject’ or ‘educationLevel’ expressed by that node in the educational framework.
educational use a text description of purpose of the resource in education, for example assignment, group work.
interactivity type The predominant mode of learning supported by the learning resource. Acceptable values are ‘active’, ‘expositive’, or ‘mixed’.
is based on url A resource that was used in the creation of this resource. Useful for when a learning resource is a derivative of some other resource.
learning resource type The predominant type or kind characterizing the learning resource. For example, ‘presentation’, ‘handout’.
time required Approximate or typical time it takes to work with or through this learning resource for the typical intended target audience
typical age range The typical range of ages of the content’s intended end user.
Of course, much of the other information one would want to provide about a learning resource (what it is about, who wrote it, who published it, when it was written/published, where it is available, what it costs) was already in schema.org.
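To give a feel for how several of these properties might sit together in a page, here is a small Python sketch that renders a set of property values as schema.org microdata. The worksheet and its values are invented for illustration, and the to_microdata helper is hypothetical; a real page would weave the itemprop attributes into its existing HTML rather than generate a block like this.

```python
from html import escape

# Invented example values for a handful of the LRMI properties.
properties = {
    "name": "Fractions worksheet",
    "educationalUse": "assignment",
    "interactivityType": "active",
    "learningResourceType": "handout",
    "timeRequired": "PT30M",  # ISO 8601 duration: 30 minutes
    "typicalAgeRange": "7-9",
}


def to_microdata(item_type, props):
    """Render simple text-valued properties as a microdata snippet."""
    lines = [f'<div itemscope itemtype="http://schema.org/{escape(item_type)}">']
    for prop, value in props.items():
        lines.append(f'  <span itemprop="{escape(prop)}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


html_snippet = to_microdata("CreativeWork", properties)
print(html_snippet)
```

The sketch only handles plain text values; properties whose value is another item, such as audience or educationalAlignment, would need a nested itemscope.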
Unfortunately one really important property suggested by LRMI hasn’t yet made the cut, that is useRightsURL, a link to the licence under which the resource may be used, for example the Creative Commons licence under which it has been released. This was held back because of obvious overlaps with non-educational resources. The managers of schema.org want to make sure that there is a single solution that works across all domains.
Guides and tools
To promote the uptake of these properties, the Association of Educational Publishers has released two new user guides.
As the last two resources show, LRMI metadata is used by the Learning Registry and services built on it. For what it is worth, I am not sure that is a great example of its potential. For me the strong point of LRMI/schema.org is that it allows resource descriptions in human readable web pages to be interpreted as machine readable metadata, helping create services to find those pages; crucially the metadata is embedded in the web page in a way that Google trusts because the values of the metadata are displayed to users. Take away the embedding in human readable pages, which is what seems to happen when used with the Learning Registry, and I am not sure there is much of an advantage for LRMI compared to other metadata schema, though to be fair I’m not sure that there is any comparative disadvantage either, and the effect on uptake will be positive for both sides. Of course the Learning Registry is metadata agnostic, so having LRMI/schema.org metadata in there won’t get in the way of using other metadata schema.
Disclosure (or bragging)
I was lucky enough to be on the LRMI technical working group that helped make this happen. It makes me very happy to see this progress.