Learning Resource Types in LRMI

It took a while, but we now have in LRMI a Learning Resource Type concept scheme that defines a controlled vocabulary of terms that you might use to describe the type or nature of a learning resource.

Why it took a while: what is a learning resource, and what is a type?

Aside from everything in metadata being harder than you first think, and our having less time than we would like, the main reason it took so long (a few years, in fact) comes down to two questions: what is a learning resource? and what is a type? Some folk maintain that there is no well-defined class of “learning resources”, that anything can be used for learning and teaching, and that trying to describe different sub-types of “anything” is going to be a fruitless task. Pedagogically, I have no argument with the statement that anything can be used for learning and teaching, but for information systems that is not a useful starting point. I have seen repositories crash and burn because they took it as their collection policy. Telling people who are looking for resources to help them learn maths that they can use anything, so long as they are imaginative in how they use it, is not helpful.

By way of analogy, pretty much anything can be used as a hammer. Some things will be better than others, the ones that are the right weight, hard and not brittle, but I’ve used stones, shoes, lumps of wood, monkey wrenches and so on as hammers with some success. That doesn’t mean that “hammer” doesn’t exist as a category, nor does it mean that it isn’t useful to distinguish a peen hammer from a sledgehammer from a copper-headed mallet. Not that I am easily distracted, but I have found plenty of shops that not only sell hammers as a distinct type of tool but also stock a fascinating array of different types of specialist hammer.

Our first step to resolving this discussion was a couple of years back when we agreed to define a class of Learning Resource as:

A persistent resource that has one or more physical or digital representations, and that explicitly involves, specifies or entails a learning activity or learning experience.

So, not just anything that can be used for learning and teaching, but something that is meant to be used for learning and teaching.

Intuitively it seems clear that there are different types of learning resource: lesson plans are different to textbooks, video lectures are different to assessment items. But how to encapsulate that? Is something an assessment because it is used for assessment, or is there something inherent in some resources that makes them assessments? Likewise, is a video lecture a different type of thing from a lecture, or just a different format? The answer in each case is sometimes yes to both. The use of something may be strongly correlated with what it is, but use and type are still distinct. That is fine: we have in LRMI a property of educationalUse, which can be assessment, and now learningResourceType, which can also be assessment. Likewise, the format of something may be correlated with what it is: textbooks tend to include text; a video recording of a lecture will be a video. Again that is fine: we have MIME types and other ways of encoding format to convey that information, but they won’t tell you whether something is a textbook or a children’s picture book, and not all recordings of lectures will be videos. So learning resource type may be correlated with educational use and with format without being the same as either.

Principles adopted for the LRMI Learning Resource Type vocabulary

As with all our work in LRMI, we adopted a couple of principles. First, the vocabulary should focus solely on what is relevant to learning, education and training: other vocabularies deal well with other domains and with generic terms. Second, we should create a small vocabulary of broad, high-level terms to which other people can map their special cases and similar terms: those special cases are often so context dependent that they frequently don’t travel well. Related to both of these, we mapped our vocabulary to terms in two others: CEDS and the Library of Congress Genre/Form Terms. The links to CEDS terms are useful because CEDS is well established in the US system, and it provided pre-existing terms, many of which we adopted. The link to the LoC terms is useful because it connects our terms to a comprehensive list of generic terms. The LoC list is an example of a vocabulary that you might want to use if you are describing things like data as learning resources: we don’t cover data because data as a learning resource is not distinct from data in general, but we are all linked data here, and when providing resource descriptions you can mix terms from our scheme with those from others.

Using the LRMI Learning Resource Type vocabulary

The vocabulary is expressed in SKOS, and so is ready for linked data use.

If you manage your own list of learning resource types using SKOS, we invite you to create links to the LRMI concepts and thus improve the interoperability of learning resource descriptions. We would be interested in hearing from you if you are in this situation. Perhaps you have suggestions for further concepts; you can raise an issue about the concept scheme if that is the case.
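One way to make such a link is a skos:exactMatch (or skos:closeMatch) statement from your concept to ours. Here is a minimal sketch in JSON-LD, built as a python dict; the local URI and label are illustrative, not real published terms:

```python
import json

# A hypothetical local concept linked to the LRMI "textbook" concept.
# Only the LRMI URI is real; the example.edu URI is an assumption.
mapping = {
    "@context": {"skos": "http://www.w3.org/2004/02/skos/core#"},
    "@id": "http://example.edu/vocab/lrt/course-book",
    "@type": "skos:Concept",
    "skos:prefLabel": "course book",
    "skos:exactMatch": {
        "@id": "http://purl.org/dcx/lrmi-vocabs/learningResourceType/textbook"
    },
}

print(json.dumps(mapping, indent=2))
```

A consumer that understands SKOS can then treat descriptions using your local term as if they used the LRMI term.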

If you create learning resource descriptions you may reference this vocabulary in several ways, for example in JSON-LD you may have:

  { "@context": {
        "lrmi"    : "http://purl.org/dcx/lrmi-terms/",
        "lrt"     : "http://purl.org/dcx/lrmi-vocabs/learningResourceType/",
        "dcterms" : "http://purl.org/dc/terms/",
        "lrmi:learningResourceType" : { "@type": "@id" }
    },
    "@type": "lrmi:LearningResource",
    "@id": "http://example.edu/textbooks/0001",
    "dcterms:title": "Example Textbook",
    "lrmi:learningResourceType": "http://purl.org/dcx/lrmi-vocabs/learningResourceType/textbook"
  }

Or, if you don’t want to rely on consumers understanding/dereferencing that URI to get vital information, you may prefer:

{ "@context": {
    "lrmi"    : "http://purl.org/dcx/lrmi-terms/",
    "lrt"     : "http://purl.org/dcx/lrmi-vocabs/learningResourceType/",
    "dcterms" : "http://purl.org/dc/terms/",
    "skos"    : "http://www.w3.org/2004/02/skos/core#"
  },
  "@type": "lrmi:LearningResource",
  "@id": "http://example.edu/textbooks/0001",
  "dcterms:title": "Example Textbook",
  "lrmi:learningResourceType": {
    "@id": "http://purl.org/dcx/lrmi-vocabs/learningResourceType/textbook",
    "skos:prefLabel": "textbook"
  }
}

In schema.org, you may use the labels we define as simple string values, but you could also include a link to our full definition (and hence provide access to the links to other schemes that we define; after all, this is linked data) by using a DefinedTerm as the value for learningResourceType:

{ "@context": "https://schema.org/",
  "@type": "LearningResource",
  "@id": "http://example.edu/textbooks/0001",
  "name": "Example Textbook",
  "learningResourceType": {
    "@type": "DefinedTerm",
    "@id": "http://purl.org/dcx/lrmi-vocabs/learningResourceType/textbook",
    "name": "textbook"
  }
}

Graphical Application Profiles?

In this post I outline how a graphical representation of an application profile can be converted to SHACL that can be used for data validation.

My last few posts have been about work I have been doing with the Dublin Core Application Profiles Interest Group on Tabular Application Profiles (TAPs). In introducing TAPs I described them as “a human-friendly approach that also lends itself to machine processing”. The human readability comes from the tabular format, and the use of a defined CSV structure makes this machine processable. I’ve illustrated the machine processability through a python program, tap2shacl.py, that will convert a TAP into SHACL that can be used to validate instance data against the application profile, and I’ve shown that this works with a simple application profile and a real-world application profile based on DCAT. Once you get to these larger application profiles the tabular view is useful but a graphical representation is also great for providing an overview. For example here’s the graphic of the DCAT AP:

Source: EU Joinup DCAT AP

Mind the GAP

I’ve long wondered whether it would be possible to convert the source for a graphical representation of an application profile (let’s call it a GAP) into one of the machine-readable RDF formats. That boils down to processing the native file format of the diagram, or any export from the graphics package used to create it, so I’ve routinely checked for that possibility whenever I come across a new diagramming tool. The breakthrough came when I noticed that Lucidchart allows CSV export. After some exploration, this is what I came up with.

As diagramming software, Lucidchart is quite familiar from Visio, yEd, diagrams.net and the like: it lets you produce diagrams like the one below, of the (very) simple book application profile that we use in the DC Application Profiles Interest Group for testing:

[Figure: two boxes, one representing data about a book, the other data about a person, joined by an arrow representing the author relationship. Further detail about the book and author data is provided in the boxes, as discussed in the text of the blog post.]

One distinctive feature of Lucidchart is that, as well as entering text directly into fields in the diagram, you can enter it into a data form associated with any object in the diagram, as shown below, first for the page and then for the shape representing the Author:

A screenshot of the Lucidchart software showing the page and the page data.

A screenshot of the Lucidchart software showing the Author shape and the data for it.

In the latter shot especially you can see the placeholder brackets [] in the AuthorShape object, into which the values from the custom data form are put for display. Custom data can be associated with the document as a whole, with any page in it, and with any shape (boxes, arrows etc.) on the page; you can also create templates for shapes so that all shapes from a given template have the same custom data fields.

I chose a template to represent Node Shapes (in the SHACL/ShEx sense; these become actual shapes in the diagram) that had the following data:

  • name and expected RDF type in the top section;
  • information about the node shape, such as label, target, closure and severity, in the middle section; and,
  • a list of the properties that have the range Literal, entered directly into the lower section (i.e. these don’t come from the custom data form).

Properties that have a range of BNode or URI are represented as arrows.

By using a structured string for Literal-valued properties, and by adding information about the application profile and about namespace prefixes and their URIs into the sheet’s custom data, I was able to enter most of the data needed for a simple application profile. The main shortcomings are that the format for Literal-valued properties is limited, and that complex constraints such as alternatives (such as: use this Literal-valued property or that URI property depending on …) cannot be dealt with.
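To give a flavour of what parsing such a structured string might look like, here is a small sketch; the semicolon-separated “path ; constraints” format is an illustrative assumption, not the exact syntax used in my diagrams:

```python
# Sketch: turn an assumed "path ; flags" string into property constraints.
# The format and flag names here are hypothetical.
def parse_literal_property(text):
    parts = [p.strip() for p in text.split(";")]
    prop = {"path": parts[0], "nodeKind": "Literal"}
    for flag in parts[1:]:
        if flag == "mandatory":
            prop["minCount"] = 1
        elif flag.startswith("xsd:"):
            prop["datatype"] = flag
    return prop

print(parse_literal_property("dct:title ; mandatory ; xsd:string"))
```

Anything that doesn’t fit a flat one-line format like this (alternatives, nested constraints) is exactly what gets lost.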

The key to the magic is that on export as CSV, each page, shape and arrow gets a row, and there is a column for each of the default text areas and for each item of custom data (whether or not the latter is displayed). It’s an ugly, sparsely populated table (you can see a copy in github), but I can read it into a python dict structure using python’s standard csv module.
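In outline, the reading step looks something like the sketch below; the column names (“Name”, “Text Area 1” and so on) are illustrative stand-ins, as a real export has many more sparsely populated columns:

```python
import csv
import io

# A toy stand-in for the Lucidchart CSV export; real exports are much wider.
sample_export = """Id,Name,Text Area 1,label,target
1,Page,,,
2,NodeShape,BookShape,Book,sdo:Book
3,NodeShape,AuthorShape,Author,
"""

rows = list(csv.DictReader(io.StringIO(sample_export)))

# Keep only rows representing node shapes, dropping empty cells.
shapes = [
    {k: v for k, v in row.items() if v}
    for row in rows
    if row["Name"] == "NodeShape"
]
print([s["Text Area 1"] for s in shapes])  # ['BookShape', 'AuthorShape']
```

Once each shape is a dict of its non-empty cells, the sparseness of the table stops mattering.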


When I created the TAP2SHACL program I aimed to do so in a very modular way: there is one module for the central application profile python classes, another to read CSV files and convert them into those python classes, and another to convert the python classes into SHACL and output it; tap2shacl.py is just a wrapper that provides a user interface to those modules. That approach paid off here because, having read the CSV file exported from Lucidchart, all I had to do was create a module to convert it into the python AP classes, and then I could use AP2SHACL to get the output. That conversion was fairly straightforward, mostly just tedious if ... else statements to parse the values from the data export. I did this in a Jupyter notebook so that I could interact more easily with the data; that notebook is in github.
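The modular shape of that pipeline can be sketched as below; the class and function names are illustrative, not the actual tap2shacl API:

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the central application profile classes.
@dataclass
class PropertyStatement:
    path: str

@dataclass
class ShapeInfo:
    name: str
    statements: list = field(default_factory=list)

def rows_to_ap(rows):
    """Module 1 (sketch): convert exported diagram rows into AP classes."""
    shapes = {}
    for row in rows:
        shape = shapes.setdefault(row["shape"], ShapeInfo(row["shape"]))
        shape.statements.append(PropertyStatement(row["property"]))
    return list(shapes.values())

def ap_to_shacl(shapes):
    """Module 2 (sketch): serialize AP classes as very simplified SHACL."""
    lines = []
    for s in shapes:
        lines.append(f"<{s.name}> a sh:NodeShape ;")
        for st in s.statements:
            lines.append(f"    sh:property [ sh:path {st.path} ] ;")
        lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)

rows = [
    {"shape": "BookShape", "property": "dct:title"},
    {"shape": "BookShape", "property": "dct:creator"},
]
print(ap_to_shacl(rows_to_ap(rows)))
```

Because only the first stage knows anything about Lucidchart, swapping the TAP reader for a GAP reader leaves the SHACL output stage untouched.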

Here’s the SHACL generated from the graphic for the simple book ap, above:

# SHACL generated by python AP to shacl converter
@base <http://example.org/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sdo: <https://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<BookShape> a sh:NodeShape ;
    sh:class sdo:Book ;
    sh:closed true ;
    sh:description "Shape for describing books"@en ;
    sh:name "Book"@en ;
    sh:property <bookshapeAuthor>,
        <bookshapeTitle> ;
    sh:targetClass sdo:Book .

<AuthorShape> a sh:NodeShape ;
    sh:class foaf:Person ;
    sh:closed false ;
    sh:description "Shape for describing authors"@en ;
    sh:name "Author"@en ;
    sh:property <authorshapeFamilyname>,
        <authorshapeGivenname> ;
    sh:targetObjectsOf dct:creator .

<authorshapeFamilyname> a sh:PropertyShape ;
    sh:datatype xsd:string ;
    sh:maxCount 1 ;
    sh:minCount 1 ;
    sh:name "Family name"@en ;
    sh:nodeKind sh:Literal ;
    sh:path foaf:familyName .

<authorshapeGivenname> a sh:PropertyShape ;
    sh:datatype xsd:string ;
    sh:maxCount 1 ;
    sh:minCount 1 ;
    sh:name "Given name"@en ;
    sh:nodeKind sh:Literal ;
    sh:path foaf:givenName .

<bookshapeAuthor> a sh:PropertyShape ;
    sh:minCount 1 ;
    sh:name "author"@en ;
    sh:node <AuthorShape> ;
    sh:nodeKind sh:IRI ;
    sh:path dct:creator .

<bookshapeISBN> a sh:PropertyShape ;
    sh:datatype xsd:string ;
    sh:name "ISBN"@en ;
    sh:nodeKind sh:Literal ;
    sh:path sdo:isbn .

<bookshapeTitle> a sh:PropertyShape ;
    sh:datatype rdf:langString ;
    sh:maxCount 1 ;
    sh:minCount 1 ;
    sh:name "Title"@en ;
    sh:nodeKind sh:Literal ;
    sh:path dct:title .

I haven’t tested this as thoroughly as the work on TAPs. The SHACL is valid, and as far as I can see it works as expected on the test instances I have for the simple book ap (though slight variations in the rules represented have somehow crept in). I’m sure there will be ways of triggering exceptions in the code, or of getting it to generate invalid SHACL, but for now, as a proof of concept, I think it’s pretty cool.

What next?

Well, I’m still using TAPs for some complex application profile / standards work. As it stands, I don’t think I could express all the conditions that often arise in an application profile in an easily managed graphical form. Perhaps there is a way forward in generating a TAP from a diagram and then adding further rules, but then I would worry about version management if one were altered and not the other. I’m also concerned about tying this work to one commercial diagramming tool, over which I have no real control. I’m pretty sure that there is something in the GAP+TAP approach, but it would need tighter integration between the graphical and tabular representations.

I also want to explore generating outputs other than SHACL from TAPs (and graphical representations). I see a need to generate JSON-LD context files for application profiles; we should try getting ShEx from TAPs; and I have already done a little experimenting with generating RDF Schema from Lucidchart diagrams.

DCAT AP DC TAP: a grown up example of TAP to SHACL

I’ve described a couple of short “toy” examples as proof of concept of turning a Dublin Core Application Profile (DC TAP) into SHACL in order to validate instance data: the SHACL Person Example and a Simple Book Example; now it is time to see how the approach fares against a real-world example. I chose the EU Joinup Data Catalog Application Profile (DCAT AP) because Karen Coyle had an interest in DCAT; it is well documented (pdf), with a github repo that has SHACL files; there is an Interoperability Test Bed validator for it (albeit a version behind); and I found a few test instances with known errors (again, a little dated). I also found the acronym soup of DCAT AP DC TAP irresistible.
Continue reading

TAP to SHACL example

Last week I posted Application Profile to Validation with TAP to SHACL, about converting a DCMI Tabular Application Profile (DC TAP) to SHACL in order to validate instance data. I ended by saying that I needed more examples in order to test that it worked: that is, not only checking that the SHACL is valid, but also that it validates / raises errors as expected when used with instance data.
Continue reading

Application Profile to Validation with TAP to SHACL

Over the past couple of years or so I have been part of the Dublin Core Application Profile Interest Group creating the DC Tabular Application Profile (DC-TAP) specification. I described DC-TAP in a post about a year ago as a “human-friendly approach that also lends itself to machine processing­”, in this post I’ll explore a little about how it lends itself to machine processing.
Continue reading

SHACL, when two wrongs make a right

I have been working with SHACL for a few months in connection with validating RDF instance data against the requirements of application profiles. There’s a great validation tool, created as part of the JoinUp Interoperability Test Bed, that lets you upload your SHACL rules and a data instance and tests the latter against the former. But be aware: some errors can lead to the instance data successfully passing the tests. This isn’t an error with the tool, just a case of blind logic: the program does what you tell it to, regardless of whether that’s what you want it to do.
Continue reading

When RDF breaks records

In talking to people about modelling metadata I’ve picked up on a distinction, mentioned by Stuart Sutton, between entity-based modelling, typified by RDF and graphs, and record-based structures, typified by XML; however, I don’t think making this distinction alone is sufficient to explain the difference, let alone why it matters. I don’t want to get into the pros and cons of either approach here, just to give a couple of examples of where something that works in a monolithic, hierarchical record falls apart when the properties and relationships for each entity are described separately and those descriptions are put into a graph. These are especially relevant when people familiar with XML or JSON start using JSON-LD. One of the great things about JSON-LD is that you can use instance data as if it were plain JSON, without really paying much regard to the “LD” part; that’s not true when designing specs, because design choices that would be fine in a JSON record will not work in a linked data graph. Continue reading
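The basic mechanism can be sketched in a few lines: two separate “records” about the same resource stay distinct as documents, but once merged into a graph keyed by identifier their statements combine and the record boundary disappears. The URIs and properties below are illustrative:

```python
# Two hypothetical records describing the same resource.
record_a = {"@id": "http://example.org/book/1", "dct:title": "Semantics"}
record_b = {"@id": "http://example.org/book/1", "sdo:numberOfPages": 300}

# Merge them graph-style: all statements attach to the same node.
graph = {}
for record in (record_a, record_b):
    node = graph.setdefault(record["@id"], {})
    for key, value in record.items():
        if key != "@id":
            node[key] = value

# The graph no longer remembers which record each statement came from.
print(graph["http://example.org/book/1"])
```

Any design rule that relies on knowing which record a statement arrived in (ordering, grouping, “this field qualifies the one above it”) breaks at exactly this point.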

Thoughts on IEEE ILR

I was invited to present as part of a panel for a meeting of the IEEE P1484.2 Integrated Learner Records (ILR) working group, discussing issues around the “payload” of an ILR, i.e. the description of what someone has achieved. For context, I followed Kerri Lemoie, who presented on the work happening in the W3C VC-Ed Task Force on Modeling Educational Verifiable Credentials, which is currently the preferred approach. Here’s what I said: Continue reading

JDX: a schema for Job Data Exchange

[This rather long blog post describes a project that I have been involved with through consultancy with the U.S. Chamber of Commerce Foundation.  Writing this post was funded through that consultancy.]

The U.S. Chamber of Commerce Foundation has recently proposed a modernized schema for job postings based on the work of HR Open and Schema.org: the Job Data Exchange (JDX) JobSchema+. It is hoped that JDX JobSchema+ will not just facilitate the exchange of data relevant to jobs, but will do so in a way that helps bridge the various other standards used by relevant systems. The aim of JDX is to improve the usefulness of job data, including signalling around jobs, addressing such questions as: what jobs are available in which geographic areas? What are the requirements for working in these jobs? What are the rewards? What are the career paths? This information needs to be communicated not just between employers and their recruitment partners and to potential job applicants, but also to education and training providers, so that they can create learning opportunities that provide their students with skills that will be valuable in their future careers.

Job seekers empowered with a greater quantity and quality of job data through job postings may secure better-fitting employment faster, and keep it for longer, due to improved matching. Preventing wasted time and hardship may be particularly impactful for populations whose job searches are less well resourced, and for those for whom limited flexibility increases their dependence on job details that are often missing, such as schedule, exact location, and security clearance requirements. These are among the properties that JDX gives employers the opportunity to include, for easy and quick identification by all. In short, the data should be available to anyone involved in the talent pipeline. This broad scope poses a problem that JDX also seeks to address: different systems within the talent pipeline data ecosystem use different data standards, so how can we ensure that the signalling is intelligible across the whole ecosystem?

Continue reading