I have posted details of my publications on this site. More interestingly, I did so with a plugin and theme extension which builds on previous work on presenting publications in WordPress and using WordPress as a semantic platform. These are useful steps towards WordPress as a lightweight repository. Continue reading
XCRI-CAP (eXchanging Course Related Information, Course Advertising Profile) is the UK standard for course marketing information in Higher Education. It is compatible with the European standard Metadata for Learning Opportunities. The W3C Schema Course Extension Community Group has developed terms for describing educational courses that are now part of schema.org. Here I look at translating the data from an XCRI-CAP XML feed to schema.org JSON-LD. Continue reading
I have a new publication: “Analysing and Improving Embedded Markup of Learning Resources on the Web,” which Stefan Dietze and Davide Taibi presented at the 2017 International World Wide Web Conference in Perth, Australia. I played a minor role in the “analysing” part of this work; the heavy lifting was done by my co-authors. They analysed data from the Common Crawl to identify sites that were using LRMI terms in their schema.org markup. The analysis provides answers to important questions such as: who is using LRMI metadata, and which terms are they using? How many resources have been marked up with LRMI metadata? Are the numbers of users growing? What mistakes are being made in implementing LRMI? Continue reading
I summarised the presentations given at the LRMI workshop “Building on Schema.org to describe learning resources” at the Dublin Core conference in my previous post; this post is about the discussion part of the workshop. The discussion was organised along the lines of a balloon debate:
A balloon debate is a debate in which a number of speakers attempt to win the approval of an audience. The audience is invited to imagine that the speakers are flying in a hot-air balloon which is sinking and that someone must be thrown out if everyone is not to die. Each speaker has to make the case why they should not be thrown out of the balloon to save the remainder.
Our “balloon” is a metaphorical one: it stands for the volunteer effort that maintains LRMI (LRMI seeming somewhat like a balloon in that it is kept going by hot air), and instead of carrying people it is trying to carry out work that will help people describe learning resources. While it is by no means sinking, like a real balloon LRMI can only sustain a certain (work-)load, and so we need to be selective about the work which is taken on. This exercise was intended to provide LRMI with ideas about which work to prioritize, while providing those who took part with information about the future directions that are open to LRMI.
The debate proceeded through three rounds: the first was about choosing candidate ideas for future work, the second eliminated those which seemed least feasible, and the third was to select the highest-priority ideas. On the day we didn’t really get to the third round, but the results were nonetheless interesting.
We started with the following ideas, seeded by myself with help from other members of the LRMI task group (especially Stuart Sutton who provided several of the descriptions).
- 1. Structured, controlled vocabularies for LRMI properties.
- Several key LRMI properties have text as their expected value type. The use of free text for properties such as learningResourceType makes it difficult to compare data from different providers. Where suggested values for these are mentioned in the LRMI spec, the LRMI Task Group is already working to provide definitions of terms in RDF (as SKOS Concepts). This suggestion is a continuation of that work: to provide, as far as possible, controlled vocabularies for relevant LRMI properties.
- 2. Drop the Alignment Object for the most common alignment types
- The mechanism for indicating how a resource relates to an educational framework involves a level of indirection which is both arcane and potentially powerful; however, its full power has never been fully developed. The suggestion here is that the indirection be removed by creating properties for those alignment types which are clearly important. So, for example, it is often important to state the educational level of a resource (e.g. in terms of grade level). Currently this would be an educationalAlignment with an AlignmentObject having an alignmentType of educationalLevel. The indirection of the AlignmentObject could be removed if there were a property of a learning resource called educationalLevel referencing a point in a grade level framework directly.
Figure 1a: representing an educational alignment with an alignment object.
The alignmentType property of the alignment object can be set to specify the nature of the relationship, e.g. that this represents the educational level of the resource.
Figure 1b: a possible way of representing an alignment such as educational level of the resource more simply. Note: the value provided could be text or a URI; representing the relationship of the node to an educational framework is not yet solved in Schema.org.
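As a rough sketch of the two shapes in Figures 1a and 1b, here is some JSON-LD built in Python (all names and values are invented; note too that the direct educationalLevel property is the proposal being described, not, at the time of writing, part of schema.org):

```python
import json

# Figure 1a style: the existing indirection through an AlignmentObject.
with_alignment_object = {
    "@context": "http://schema.org",
    "@type": "CreativeWork",
    "name": "Fractions worksheet",  # invented example resource
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "educationalLevel",
        "educationalFramework": "US K-12",  # invented framework name
        "targetName": "Grade 3",
    },
}

# Figure 1b style: the *proposed* direct property (hypothetical here);
# as the caption notes, the value could be text or a URI.
with_direct_property = {
    "@context": "http://schema.org",
    "@type": "CreativeWork",
    "name": "Fractions worksheet",
    "educationalLevel": "Grade 3",
}

print(json.dumps(with_alignment_object, indent=2))
print(json.dumps(with_direct_property, indent=2))
```

The second form is clearly simpler to write and to query; what it loses is the ability to say anything further about the alignment itself.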
- 3. Develop the Alignment Object to be more expressive
- The mechanism for indicating how a resource relates to an educational framework involves a level of indirection which is both arcane and potentially powerful; however, its full power has never been fully developed. The suggestion here is that properties be added to the AlignmentObject to allow additional information about the alignment between a resource and a point in an educational framework. This additional information may include factors such as: who asserts that the alignment holds, what evidence they have for it, and how good the alignment is.
- 4. Recommendations for referring to educational frameworks
- From the point of view of facilitating resource discovery, the relationship between a resource and an educational framework is key. Showing how a resource relates to a framework which is understood by educators and learners allows them to find resources suitable for their needs. This may be manifest in discovery interfaces as faceted search or browse categories. One problem is identifying the relevant frameworks for different types of educational alignment in different educational contexts (e.g. what is the Scottish equivalent of K-12 for educational level?). Another problem is that variation in how these frameworks are expressed in LRMI metadata makes the creation of such services more difficult than it need be (what is the framework name for K-12? What URIs are best to use?). We could help by creating an inventory of frequently used educational frameworks and recommendations on how to refer to them.
- 5. Declare a “Learning Resource” type
- Currently, a Learning Resource is not formally declared in the schema.org schema as a subtype of CreativeWork. Instead, it is inferred that a Learning Resource is a kind of CreativeWork, since the LRMI properties were included as part of the CreativeWork type. The lack of an explicit LearningResource subtype of CreativeWork makes it difficult for some people doing markup or implementing systems using schema.org to recognize that schema.org in fact supports the description of learning resources. They see EducationEvent as a subtype of Event, but no LearningResource as a subtype of CreativeWork.
- 6. Define a minimal subset of schema.org for describing learning materials
- The schema.org schema is quite large and can be intimidating for those wanting to define minimally viable learning resource descriptions: where to begin, what properties are most important, how they should be used. Publishing a suggested profile of schema.org that defines a minimally viable learning resource description, while leaving open the addition of other properties needed for a particular use, might assist implementers needing a means to jumpstart development of their own profile based on schema.org.
- 7. Create support materials explaining LRMI properties
- Recently, examples of how to use the AlignmentObject, as well as the LRMI properties of CreativeWork, have been added as ‘footers’ to the relevant schema pages at schema.org. While an excellent beginning, these additions are not enough. Other types of support materials for describing learning resources using the LRMI properties and classes need to be defined, developed, and published. Such materials might include a one-stop “primer” that combines narrative with examples and covers the inevitable points in usage where subjective decisions must be made between alternative legitimate paths forward.
- 8. Collate information about existing LRMI implementations
- Specifications always need to be interpreted in order to be used in specific contexts, and however much normative and informative material is provided, we cannot hope to cover all contexts. One way that people implementing a specification can make choices that do not lead to unnecessary divergence is by referring to other implementations from similar contexts. LRMI could facilitate this by collating information about where these implementations can be found. This may also be useful in monitoring the uptake of the specification, identifying problems commonly encountered and providing examples of good practice.
- 9. Create an editor for LRMI metadata
- Several tools exist that can create LRMI metadata within the scope of a single project or service’s needs. What is suggested here is that LRMI create, or assist and promote the creation of, an editor that can serve as a reference implementation: independent of the choices that would need to be made for any use in practice, but flexible enough to be tailored to practical use and, importantly, illustrating good practice in the implementation of LRMI.
In the end we discussed and voted on which ideas seemed most appealing; then, after eliminating those which were not popular, we discussed and voted on which seemed least feasible, and had a short discussion about the results. We kept a tally of the votes on a flip chart: green ticks for ideas we thought attractive, red ticks for those we thought least feasible.
The ideas most liked were:
- Structured, controlled vocabularies for LRMI properties,
- Define a minimal subset of schema.org for describing learning materials, and
- Collate information about existing LRMI implementations.
As was pointed out in the discussion, there are relationships between the ideas which mean that some other ideas become easier if these are achieved first (e.g. a general purpose editor becomes easier if you have agreed a general purpose subset of schema for it to edit).
Those ideas which we thought most difficult to achieve were:
- Structured, controlled vocabularies for LRMI properties and
- Declare a “Learning Resource” type.
The specific issues with both of these centred around achieving agreement or consensus on terms and definitions.
I’m pretty sure that everyone involved in LRMI will see something in these results that they think is a sensible outcome, and something which raises an eyebrow or two. I’m also fairly sure that it won’t be the same things for everyone… which is to say, the discussions will continue.
So what now?
Some of the ideas discussed are based on work that individuals or groups involved in LRMI are already scoping out. As a group we have made a start on formally describing some values for use with LRMI terms as SKOS concepts (see the DCMI/LRMI Github repository). I have had a master’s student work on creating a subset of schema.org for learning resources. There is an editor for LRMI in the works, which looks like it could meet the sort of requirements described above. And as I described in the presentation I gave at the workshop, there is work describing how LRMI is used. So hopefully we will be able to make progress on some of these ideas.
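By way of illustration, a SKOS concept definition for one learningResourceType value might look something like the following JSON-LD sketch. The concept URI, scheme and labels here are all invented for the example, not taken from the task group's actual vocabulary:

```python
import json

# A hypothetical SKOS concept for one learningResourceType value.
# All URIs and label text are invented for illustration; only the
# SKOS namespace itself is real.
concept = {
    "@context": {"skos": "http://www.w3.org/2004/02/skos/core#"},
    "@id": "http://example.org/lrmi/learningResourceType/worksheet",
    "@type": "skos:Concept",
    "skos:prefLabel": {"@value": "Worksheet", "@language": "en"},
    "skos:definition": {
        "@value": "A sheet of exercises for learners to complete.",
        "@language": "en",
    },
    "skos:inScheme": {"@id": "http://example.org/lrmi/learningResourceType"},
}

print(json.dumps(concept, indent=2))
```

Publishing values as concepts like this, each with a URI, a preferred label and a definition, is what would let data from different providers be compared, rather than matching free-text strings.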
As I said during the workshop, LRMI is a volunteer effort. People work on whichever aspects of it interest them. If you are interested in any of the ideas described above, or have other ideas that you think would be valuable, please join us, and I hope we will be able to achieve more together.
I was in Copenhagen last week, at the Dublin Core Metadata Initiative 2016 conference, where I ran a workshop about LRMI entitled “Building on Schema.org to describe learning resources” (as one of my colleagues pointed out, thinking of a snappy title never quite happened). Here’s a quick overview of it.
There were three broad parts to the workshop: presentations on the background organisations and technology; presentations on how LRMI is being used; and a debate where attendees got to think about what could be next for LRMI.
Stefan Dietze invited me to give the keynote presentation at the pre-WWW2015 workshop Linked Learning 2015 in Florence. I’ve already posted a summary of a few of the other presentations I saw; this is a long account (from my speaker’s notes) of what I said. If you prefer, you can read the abstract of my talk from the WWW2015 Companion Proceedings or look through my slides & notes on Google. This is a summary of past work at Cetis that led to our involvement with LRMI, why we got involved, and the current status of LRMI. There’s little here that I haven’t written about before, but I think this is the first time I’ve pulled it all together in this way.
Lorna M. Campbell was co-author for the presentation; the approach I take draws heavily on her presentation from the Cetis conference session on LRMI. Most of the work that we have done on LRMI has been through our affiliation with Cetis. I’ll describe LRMI and what it has achieved shortly. In general I want to keep things fairly basic. I don’t want to assume a great deal of knowledge about the educational standards or the specifications on which LRMI is based: not because I think you don’t know anything, but firstly because I want to show what LRMI drew on, and also because whenever I talk to people about LRMI it becomes clear that different people have different starting assumptions. I want to try to make sure that we kind of align our assumptions.
LRMI prehistory and precursors
I want to start by reviewing some of what we (Lorna and I and Cetis) did before LRMI and why we got involved in it.
That means talking about metadata. Mostly metadata for the purpose of resource discovery, in order to support the reuse of educational content; we wanted to support the reuse of educational content in order to justify greater effort going into the creation of better content, allowing teachers to focus on designing better educational activities. We were never interested in metadata just for its own sake, but we felt that however good an educational resource is, if you can’t find it you can’t use it.
And we can start with the LOM, the first international standard for educational technology, designed in the late 1990s and completed in 2002 (at least the information model was: the XML binding came a couple of years later, and other serializations such as RDF were never successfully completed).
We had nothing to do with designing the LOM.
But we did promote its use, for example:
- I worked on a project called FAILTE, a resource discovery service for Engineering learning resources, which involved people with various expertise (librarians, engineering educators, learning technologists) creating what was essentially a catalogue of LOM records.
- I was also involved in a wider initiative to facilitate similar services across all of UK HE, by creating an application profile for use by joint projects of two organisations, RDN & LTSN (RLLOMAP)
- Meanwhile Lorna was leading work to create an application profile of the LOM with UK-wide applicability (UK-LOM core)
These were fairly typical examples of LOM implementation work at that time. Also, none of them still exists.
All of these involved application profiles, that is, tailoring the LOM by recommending a subset of elements to use and specifying what controlled vocabularies to use to provide values for them (see Metadata Principles and Practicalities, section IIIA). And there’s a dilemma there, or at least a compromise to be made, between creating descriptions which make sense in a local setting and meet local needs, and achieving interoperability between different implementations of the LOM.
In fact some of the initial LRMI research was a survey of how the LOM is used: looking at LOM records exposed through OAI-PMH, it found that most LOM records provided very little beyond what could be provided with simple Dublin Core elements, which agreed with previous work comparing different application profiles (e.g. Godby, 2004). (See also a similar study by Ochoa et al. (2011), conducted at about the same time, which focussed on repositories that had been designed to use the LOM.)
Anyway, one way or another we got a lot of experience in the LOM and Lorna and I were also part of the team that created the IMS learning resource meta-data specification, especially the IMS Meta-data Best Practice Guide for IEEE 1484.12.1-2002 Standard for Learning Object Metadata, 2006 which is basically a set of guidelines for how to use the LOM.
But I wasn’t talking about the LOM in Florence. Why not? Well, IEEE LOM and IMS Metadata have their uses, and if they work for you that’s great. But I’ve also mentioned some of the problems that we faced when we tried to implement the LOM in more or less open systems: lots of effort to create each record, and a compromise between interoperability and addressing specific requirements. The structure of the LOM as a single XML tree-like metadata record comprising all the information about a resource does little to help you get around these problems. It also means that the standard as a whole is monolithic: the designers of the LOM had to solve the problems of how to describe IPR, technical, lifecycle and other issues (then consider that many different resource types can be used as learning resources, and what works as a technical description of a text document might not work for an image or video). Solving how to describe educational properties is quite hard enough without throwing solutions to all of these others into the same standard.
So, having learnt a lot from the LOM, we moved on, hoping to find approaches to learning resource description that disaggregated the problem (at both design and implementation stages) into smaller, less intimidating tasks.
I want to mention some work on semantic technologies and what was then beginning to be called linked data, which Cetis helped commission and was involved in through a working group around 2008-2009: the Semantic Technologies in Learning and Teaching Jisc mini-project / Cetis working group, run by Thanassis Tiropanis et al. out of the University of Southampton. The SemTech project aimed to raise the profile of semantic technologies in HE, and to highlight what problems they were good at solving. The project included a survey of then-existing semantic tools and services used for education, to discover what they were being used for (they found 36, using a fairly loose definition of “semantic”).
The “five year plan” outlined by that project is worth reflecting on. Basically it suggested encouraging the exposure of more data which can be used by applications, thus encouraging more data to be released (a sort of optimistic virtuous cycle), followed by the development of shared ontologies, which yield benefits once you have large amounts of data. (Notably, it didn’t suggest spending years locked in a room coming up with the one ontology that would encompass everything before releasing any data.)
The development of semantic applications for teaching and learning for HE/FE over the next 5 years could be supported in a number of steps:
- Encouraging the exposure of HE/FE repositories, VLEs, databases and existing Web 2.0 lightweight knowledge models in linked data formats. Enabling the development of learning and teaching applications that make use of linked data across HE/FE institutions; there is significant activity on linked open data to be considered
- Enabling the deployment of semantic-based searching and matching services to enhance learning. Such applications could support group formation and learning resource recommendation based on linked data. The development of ontologies to which linked data will be matched is anticipated. The specification of patterns of semantic tools and services using linked data could be fostered
- Collaborative ontology building and reasoning for pedagogical ends will be more valuable if deployed over a large volume of education related linked data where the value of searching and matching is sufficiently demonstrated. Pedagogy-aware applications making use of reasoning to establish learning context and to support argumentation and critical thinking over a large linked data field could be encouraged at this stage.
Our first efforts outside of the IEEE LOM were in the Dublin Core Education Application Profile Task Group, between about 2006 and 2011, attempting to work on a shared ontology. Meanwhile others (notably Mikael Nilsson, KTH Royal Institute of Technology, Stockholm) worked to get LOM data into RDF. This work kind of fizzled out, but we did get an idea of a domain model for learning resources, which rather usefully separated the educationally relevant properties from all the others. The cloud in the middle represents any resource-type-specific domain model (say one for describing videos or one for describing textual resources) to which educationally relevant properties can be added. So this diagram represents what I was saying earlier about wanting to disaggregate the problem space so that we can focus on educational matters while other domain experts focus on their specialisms.
I want to mention in passing that around this time (2008/9) work started at ISO/IEC on semantic representation of metadata for learning resources. This was kicked off in response to the IEEE LOM being submitted for ratification as an ISO standard… and it is still ongoing. We’re not involved. Cetis has done no more than comment once or twice on this work.
In fact we did very little metadata work for a while. I thought I was done with it.
At this time there was an idea in educational technology circles encapsulated in the term #eduPunk, the idea being that lightweight personal technology could be used to support teaching and learning: a sort of DIY approach to learning technology, without the constraints of large institutional, enterprise-level systems (WordPress instead of the VLE, folksonomies instead of taxonomies).
In comparison to eduPunk, we were #eduProg. I’ve nothing against the virtuoso wizardry of ProgRock or a technically excellent OWL ontology, and I am not saying there is anything wrong with either. The point I am trying to make is that the interest and attention, the engagement, from the Ed Tech community was not in EduProg.
The attention and engagement was in Open Educational Resources, and we supported a three-year, £15 million UK HE programme around the release of HE resources under Creative Commons licences [UKOER]. Cetis provided strategic technical advice and support to the funder and to the 66 projects that released over 10,000 resources. The support included guidelines on technology approaches to the management, description and dissemination of OERs; the guidelines we gave were for lightweight dissemination technologies, minimal metadata, and putting resources where they could be found. We reflected at length on the technology approaches taken by this programme in our book Into the Wild: Technology for Open Educational Resources. We recognise the shortcomings of this approach; it’s not perfect, and some people were quite critical of it. If we had been able to point to any discovery services based on the LOM, or any more directed approach, that were unarguably successful, we would have recommended them, but it seemed that Google and the open web were at least as successful as any other approach and required less effort on the part of the projects. Partly through UKOER we did see 10,000 resources released and, more importantly, a change in culture so that using social sharing sites for education became unremarkable, and I would rather have that than a few hundred perfect metadata descriptions in a repository.
As far as resource description and resource discovery is concerned I think the most important advice we gave was this:
LRMI launched in 2011. What about it got us back into educational metadata? Let’s start from first principles and look at the motivation behind LRMI, which is to help people find resources that meet their specific needs. I’ll try to illustrate this.
Meet Pam, a school teacher. Let’s say she wants to teach a lesson about the Declaration of Arbroath.
What are her specific needs? Well, they relate to her students: their age, their abilities. They relate to her teaching scenario: is she looking for something to use as part of a half-hour lesson on a wider topic, or something that will provide a plan of work over a few lessons? An introduction, or revision? And there is the wider context: she’s unlikely to be teaching about the Declaration of Arbroath for its own sake; more likely it will relate to some aspect of a wider curriculum, perhaps history, but perhaps also something around civic engagement in Scotland, relations between Scotland and England, or precursors to the US Declaration of Independence, and she will be teaching it because she is following some shared curriculum or exam syllabus.
She searches Google and finds lots of resources, many of them no more than the text of the declaration itself.
There are also tea towels and posters.
Those that go further do not necessarily do so in a way that is suitable for her pupils. There’s a Wikipedia article, but that’s not really written with school children in mind. Google doesn’t really support narrowing Pam’s search down to match her requirements, such as the age and educational level of the students, the time required to use the resource in a lesson, or its relevance to the requirements of a national curriculum or exam syllabus, so Pam is forced to look at a series of separate search services based (often) on siloed metadata [examples 1, 2, 3]. It’s worth noting that the examples show categorisation by factors such as Key Stage (i.e. educational level in the English National Curriculum), educational subject, intended educational use (e.g. revision) and others, giving hints as to what Pam might use to filter her search. Google (historically) hasn’t been especially good at this sort of filtering, partly because it cannot always work out the relevance of the text in a document.
What happened to make us think that it was worth addressing this problem was schema.org:
a joint effort, in the spirit of sitemaps.org, to improve the web by creating a structured data markup schema supported by major search engines.
- An agreed hierarchy of entity types (see right).
- An agreed vocabulary for naming the characteristics of resources and the relationships between them.
- Which can be added to HTML (as microdata, RDFa or JSON-LD) to help computers understand what the strings of text mean.
Adding schema.org markup (as microdata) to HTML turns the code behind a web page from something like:
```html
<h1>Learning Resource Metadata Initiative: using schema.org to describe open educational resources</h1>
<p>by Phil Barker, Cetis, School of Mathematical and Computer Sciences, Heriot-Watt University <br />
Lorna M Campbell, Cetis, Institute for Educational Cybernetics, University of Bolton. April 2014</p>
```
i.e. just strings, with not much to hint at which string is the author’s name, which is the title of the paper, and which is the author’s affiliation, to something like:
```html
<div itemscope itemtype="http://schema.org/ScholarlyArticle">
  <h1 itemprop="name">Learning Resource Metadata Initiative: using schema.org to describe open educational resources</h1>
  <p itemprop="author" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Phil Barker</span>,
    <span itemprop="affiliation">Cetis, School of Mathematical and Computer Sciences, Heriot-Watt University</span>
  </p>
  <p itemprop="author" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Lorna M Campbell</span>,
    <span itemprop="affiliation">Cetis, Institute for Educational Cybernetics, University of Bolton</span>
  </p>
</div>
```
where the main entities and their relationships are marked up, and the text that relates to properties of those items is identified: a ScholarlyArticle is related to two Persons who are its authors; some of the text is the name of the ScholarlyArticle (i.e. its title), some the names of the Persons, and some their affiliations. Represented graphically, we could show this information as:
An entity – relation graph identifying the types of entities, their relationships to each other and to the strings that describe significant properties.
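The same description can be expressed as JSON-LD, the third syntax schema.org supports. Here is a sketch, built in Python for comparison with the microdata above (it is an illustration, not markup taken from the paper's actual page):

```python
import json

# The ScholarlyArticle / Person description from the microdata example,
# expressed as a JSON-LD node with nested author nodes.
article = {
    "@context": "http://schema.org",
    "@type": "ScholarlyArticle",
    "name": ("Learning Resource Metadata Initiative: "
             "using schema.org to describe open educational resources"),
    "author": [
        {
            "@type": "Person",
            "name": "Phil Barker",
            "affiliation": ("Cetis, School of Mathematical and Computer "
                            "Sciences, Heriot-Watt University"),
        },
        {
            "@type": "Person",
            "name": "Lorna M Campbell",
            "affiliation": ("Cetis, Institute for Educational Cybernetics, "
                            "University of Bolton"),
        },
    ],
}

print(json.dumps(article, indent=2))
```

One attraction of the JSON-LD form is that the entity-relation graph is explicit in the nesting, whereas in microdata it has to be teased out of the HTML structure.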
At this point LRMI was initiated: a three-year project funded by the Bill and Melinda Gates Foundation (and later with some additional funding from the Hewlett Foundation), managed jointly by one organisation committed to open education (Creative Commons) and another (the AEP) from the commercial publishing world, with input from education, publishers and metadata experts.
I was on the technical working group. We issued a call for participation, gathered use cases, and did the usual meeting and discussing to hammer out how to meet those use cases. We worked more or less in the open: there was an invitation-only face-to-face meeting near the beginning (limited funding meant we couldn’t invite everyone), but after that all the work was done on open email discussion lists and conference calls. Basing the work on schema.org allowed us to leave all the generic and resource-format-specific stuff for other people to handle, and we could focus just on the educational properties that we needed.
The slide on the left shows what came out. The first two properties are major relationships to other entities: an alignment to some educational framework, and the intended audience; the others are mostly simple characteristics. All are defined in the LRMI specification. In a previous blog post I have attempted a further explanation of the AlignmentObject. Most of these properties were added to schema.org in 2013; the link to licence information was added later.
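As a sketch of how several of these properties fit together in a single description, here is a JSON-LD example for a resource in Pam's scenario. All the values are invented for illustration; the property and type names are the real LRMI terms in schema.org:

```python
import json

# A CreativeWork described with LRMI properties (values invented).
resource = {
    "@context": "http://schema.org",
    "@type": "CreativeWork",
    "name": "The Declaration of Arbroath: a lesson plan",
    "learningResourceType": "lesson plan",
    "educationalUse": "instruction",
    "typicalAgeRange": "12-14",
    "timeRequired": "PT30M",  # ISO 8601 duration: 30 minutes
    "audience": {
        "@type": "EducationalAudience",
        "educationalRole": "teacher",
    },
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "educationalSubject",
        "targetName": "History",
    },
}

print(json.dumps(resource, indent=2))
```

Properties like typicalAgeRange, timeRequired and educationalUse are exactly the facets Pam needed and Google could not filter on from plain text.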
Current state of LRMI and future plans.
LRMI has been implemented by a number of organisations, some with project funding to encourage uptake, others more organically. One of the nice things about piggy-backing on schema.org is that people who have never heard of LRMI are using it.
Not every organisation on this list exposes LRMI metadata in its web pages; some harvest it, or create it and use it internally. The Learning Registry is especially interesting: it is a data store of information about learning resources in many different schemas, which uses LRMI as JSON-LD for (many of) its descriptive records. We have reported in some depth on the various ways in which LRMI has been implemented by those projects which were funded through the initiative.
We can create a Google custom search engine that looks for the AlignmentObject (its presence is in itself a good indicator that someone has considered the resource to be useful for learning), and we can add filters to find learning resources useful for specific contexts, in this case different educational levels. This helps Pam narrow down her search, at least as a proof of concept; as they stand these are not intended to be useful services.
I would like to note the following points from these implementations:
- they exist. That’s a good first step.
- not every implementation exposes LRMI metadata, some use it internally.
- schema.org is a lightweight, loose ontology, and implementation is looser. The people implementing it tend to make mistakes: expect to find strings where there should be a link, and links and properties where they shouldn’t be (see also Martin Hepp’s presentation From Ontologies to Web Ontologies, and “the charm of weak semantics” in the Bulletin of the Association for Information Science and Technology, Vol. 41 No. 4).
- there is no agreement on value spaces, either the terms or their meanings (e.g. for educational level: "1st Grade", "Primary 1").
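These loose implementations mean that anyone consuming LRMI at web scale needs to handle values defensively. The sketch below shows the kind of tolerant handling I have in mind, assuming a consumer that meets a bare string where the spec expects an AlignmentObject; the function name and data are invented.

```python
# A sketch of defensive handling for educationalAlignment values found
# in the wild: a correct AlignmentObject, a bare string (a common
# mistake), or a list of either. The helper name is hypothetical.
def alignment_names(value):
    """Return target names from an educationalAlignment value."""
    if isinstance(value, str):
        # Mistake: a string where a linked AlignmentObject was expected.
        return [value]
    if isinstance(value, dict):
        # Correct: an AlignmentObject with a targetName property.
        return [value.get("targetName", "")]
    if isinstance(value, list):
        # Multiple alignments, possibly a mix of both forms.
        return [n for v in value for n in alignment_names(v)]
    return []

print(alignment_names({"@type": "AlignmentObject", "targetName": "Mathematics"}))
print(alignment_names("Mathematics"))  # same answer from sloppy markup
```

The lack of agreed value vocabularies means even the recovered strings ("1st Grade" vs "Primary 1") still need mapping to a common educational framework before they are comparable.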
The Gates funding for LRMI is now complete, and as an organization LRMI is now a task group of the Dublin Core Metadata Initiative. That provides us with the mechanisms and governance required to maintain, promote and, if necessary, extend the specification. It does not mean that LRMI terms are DC terms; they're not, they're in a different namespace. DCMI is more than a set of RDF terms: it is a community of experts working together, and that is what LRMI is now part of. The LRMI specification is now a community specification of DCMI, conforming to the requirements of DCMI, such as having well-maintained definitions in RDF, which align with the schema.org definitions but are independently extensible.
The planned work of the task group is shown on the group wiki, and includes:
- Extending LRMI: Events? Courses?
- contributing via new schema.org extension mechanism?
- Recommended value vocabularies
- Linked data representation of educational frameworks (alignment)
(There’s also a background interest in the use of LRMI beyond the original schema.org scenario, for example as stand-alone JSON-LD or as EPUB metadata for eTextBooks)
It’s customary to allow time for the audience to ask difficult questions of the presenter. I tried to forestall that by asking the audience’s opinion on these questions:
- Does this help with the endeavour to expose lightweight linked data?
- (can you get the data out of web pages?)
- How do we encourage linked data representation of educational frameworks?
- How much goes into schema.org (or similar) or should we just reuse existing ontologies?
- Can you cope with the quality of data that can be provided at web scale?
Reflections on the presentation
As far as I could judge from the questions that I couldn't answer well, the weak points in the presentation, or maybe in LRMI itself, are around gauging the level of uptake: how many pages are there out there with LRMI data on them? I don't know. The schema.org pages for each entity show usage; for example, the Alignment Object is on between 10 and 100 domains, but I do not know the size of those domains. That figure also misses services that use LRMI without exposing it in their webpages, but that would expose it as linked data in some other format. I suspect uptake is lower than I would like.
As presenter I was happy that even after I had talked about all that for about 45 minutes, there were people who wanted to ask me questions (the forestalling tactic didn’t work), and even after that there were people who wanted to talk to me instead of going for coffee. That seems to be a good indicator that there was interest from the workshop’s audience.
Image credits: Photo of Pam Robertson, teacher, by Vgrigas (Own work) [CC-BY-SA-3.0 ], via Wikimedia Commons; reproduction of Tyninghame (1320 A.D) copy of the Declaration of Arbroath, 1320, via Wikimedia Commons. Logos (Heriot-Watt, Cetis, LRMI, Semtech etc.) are property of the respective organisation. Unless noted otherwise on slide image, other images created by the authors and licensed as CC-BY.
At last, it is official: “effective October 23, 2014, leadership and governance of the Learning Resource Metadata Initiative (LRMI) […] have transferred from the Association of Educational Publishers and Creative Commons to the Dublin Core Metadata Initiative (DCMI).”
This represents the end of LRMI as a project, and the start of it as a member of the family of stable-but-evolving educational metadata specifications, one which is maintained under the governance protocols of DCMI. This does not mean that LRMI is being merged in some way with Dublin Core’s well known metadata element set or terms; DCMI is more than its specifications, it is (as it says on its website banner, with my emphasis) a metadata community “supporting innovation in metadata design, implementation and best practices”. The longstanding high regard with which the DC Metadata Element Set and DC Terms are held is testament to the care and expertise that the DCMI community devotes to specification development and curation. LRMI will now benefit from that same care and expertise. It will also benefit from representation among other educational technology and metadata specification and standardization bodies, for example through the Digital Learning Metadata Alliance.
This does not mean that activity and development of LRMI will stop. It won't (I hope), but new developments will be subject to careful scrutiny and well governed procedures leading to community acceptance. The scope and pace of new developments won't be prescribed by project proposals, workpackages and deadlines agreed with funders; they will depend on people articulating needs and solutions that they are prepared to put effort into developing. The funding was necessary to get the initiative up and running, and the funders have been great in allowing the project to be responsive and agile in ascertaining and meeting the needs of key stakeholders; they never dictated how the project met its aims. The project would not have reached this successful end point had that not been the case. For any initiative to continue after the initial project funding ends is a challenge, but I believe LRMI as part of DCMI is well placed to meet it. The onus is now on the community who use LRMI to take it forward, by defining new areas for building consensus within the DCMI governance framework and by bringing new resources to the effort.
Transfer to DCMI does not mean the end of the involvement of many of the individuals who have worked on LRMI in the past. The task of setting up LRMI within DCMI is being undertaken by a task group that includes Lorna M Campbell and me (extending the work that Cetis were funded to undertake in phase 3 of the project), along with Stuart Sutton, Dan Brickley, Michael Jay and Steve Midgley, all of whom have been active throughout the project. Beyond that, one of the reasons why DCMI was felt to be an appropriate home for LRMI is that no barriers are put in the way of anyone who wants to contribute to work on its specifications, so I hope that we will see many others who have contributed to LRMI in the past joining us.
Many thanks to my colleagues at the Association of Educational Publishers, Creative Commons and the Dublin Core Metadata Initiative who have made this possible.
One of the issues around eTextBooks is how to describe them, specifically by way of educational metadata in ePub. That's something that on the face of it shouldn't be too difficult to address (at least to the extent that we know how to describe any educational resource). One thing that would be useful in demonstrating different choices for educational metadata is an app or tool that will display any metadata found in the ePub package in a sensible way. As a bit of a long shot I tried four eBook readers to see whether they would; they don't. The details follow, if you're interested, but do let me know if you know of any tool that might be useful.
The package metadata of an ePub can include a selection of Dublin Core elements and terms. These can be refined, for example you may have two dc:title elements with refinements to specify that one is the main title and the other the subtitle. You can also extend with elements from other XML namespaces, or if you prefer you can just link to a metadata record of your favourite flavour which can be either inside the ePub package or elsewhere on the web. Any of this metadata can relate to the eBook as a whole or some part of it, e.g. a single chapter or image. Without going into details there seems to be enough scope there to experiment with how educational characteristics of the eBook might be described.
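To illustrate the refinement mechanism described above, here is a sketch of EPUB 3 package metadata with a main title and a subtitle, parsed with Python's standard library. The subtitle text and identifier are invented; the refines/title-type pattern is the one the EPUB package document uses.

```python
import xml.etree.ElementTree as ET

# A fragment of EPUB 3 package metadata: two dc:title elements, each
# refined by a meta element that says which kind of title it is.
OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:isbn:9780000000000</dc:identifier>
    <dc:title id="t1">EPUB 3 Best Practices</dc:title>
    <meta refines="#t1" property="title-type">main</meta>
    <dc:title id="t2">Optimize Your Digital Books</dc:title>
    <meta refines="#t2" property="title-type">subtitle</meta>
  </metadata>
</package>"""

NS = {"opf": "http://www.idpf.org/2007/opf",
      "dc": "http://purl.org/dc/elements/1.1/"}

root = ET.fromstring(OPF)
titles = {}
for title in root.findall(".//dc:title", NS):
    # Find the meta element that refines this title, if there is one.
    refine = root.find(".//opf:meta[@refines='#%s']" % title.get("id"), NS)
    kind = refine.text if refine is not None else "main"
    titles[kind] = title.text

print(titles)
# {'main': 'EPUB 3 Best Practices', 'subtitle': 'Optimize Your Digital Books'}
```

The same refines mechanism, or elements from other namespaces, is where educational characteristics could be experimented with.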
But how to see the results? I took an ePub (a copy of O'Reilly's EPUB 3 Best Practices, since it seemed likely to be as good a starting point as I was going to find in a real book), made a copy, unzipped it, and changed the values of the metadata elements so that I could easily identify which elements were being displayed. For example I changed
<dc:title id="pub-title">EPUB 3 Best Practices</dc:title> to
<dc:title id="pub-title">dc:title</dc:title> and so on.
Here’s a list of the metadata elements in that file:
<dc:creator id="..." >
<meta id="meta-identifier" property="dcterms:identifier">
<meta property="dcterms:title" id="meta-title">
<meta property="dcterms:language" id="meta-language">
<meta id="...." property="dcterms:creator">(repeated, different ids)
<meta name="cover" content="cover-image"/>
I then looked at this with various eBook readers:
I had hopes for Readium since it is pretty much the reference implementation of EPUB3. It displayed
- meta dcterms:modified
Note that it doesn’t even check for a valid value for dates.
Calibre, while it doesn't claim to support ePub3, is targeted at managing personal book libraries. It displays:
- dc:subject (for tags)
It probably uses dc:language and dc:date (for published) as well, but recognises that the dummy values I put in dc:language / dc:date aren't valid.
The Ideal Reader for Android is the other ePub3 reader I use. It displays
- meta dcterms:creator (just one of them)
Finally I gave a chunk of disk space to Apple's iTunes. It displayed:
- dc:title (again)
- dc:subject (in the info tab, as Genre)
Yep, title is there twice: the info tab shows dc:title in the Name and Album fields, so you can gauge the amount of effort that Apple have put into adapting iTunes for books.
What did I learn?
I learnt that none of the ePub reading/management apps or tools that I have show more than the bare minimum of metadata, even when it is there. None of them will be much good for trying out ideas for how educational characteristics can be described, since I strongly suspect that none of that metadata will be viewable. That's not too surprising, especially when you consider that none of the tools I looked at is geared around resource discovery, but I can't really go uploading dummy ePub files to book-seller sites just to see what they look like. Maybe any meaningful exploration or demonstration of educational metadata in ePub is going to need a bespoke application, but if you know of a tool that might be helpful do drop me a line.
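A bespoke metadata dumper of the kind suggested above need not be a big job. Here is a sketch: open an ePub (which is a zip file), locate the package document via META-INF/container.xml, and print every metadata element found. The sample book built in memory here is invented; a real tool would open a file from disk.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Build a minimal, invented ePub in memory so the example is self-contained.
CONTAINER = """<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container" version="1.0">
  <rootfiles><rootfile full-path="content.opf" media-type="application/oebps-package+xml"/></rootfiles>
</container>"""
OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>dc:title</dc:title>
    <dc:creator>dc:creator</dc:creator>
    <meta property="dcterms:modified">2014-01-01T00:00:00Z</meta>
  </metadata>
</package>"""

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("mimetype", "application/epub+zip")
    z.writestr("META-INF/container.xml", CONTAINER)
    z.writestr("content.opf", OPF)

def dump_metadata(epub_bytes):
    """Yield (tag, attributes, text) for every package metadata element."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as z:
        container = ET.fromstring(z.read("META-INF/container.xml"))
        ns = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
        opf_path = container.find(".//c:rootfile", ns).get("full-path")
        package = ET.fromstring(z.read(opf_path))
        meta = package.find("{http://www.idpf.org/2007/opf}metadata")
        for el in meta:
            tag = el.tag.split("}")[-1]  # strip the namespace for display
            yield tag, dict(el.attrib), (el.text or "").strip()

for tag, attrs, text in dump_metadata(buf.getvalue()):
    print(tag, attrs, text)
```

Because it dumps whatever elements are present rather than a fixed list, a tool along these lines would show experimental educational metadata that the readers above silently ignore.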