Tag Archives: semantic technologies

What am I doing here? 3. Tabular Application Profiles

The third thing (of four) that I want to include in my review of what projects I am working on is the Dublin Core standard for Tabular Application Profiles. It’s another of my favourites, a volunteer effort under the DCMI Application Profiles Working Group.

Application profiles are a powerful concept, central to how RDF can be used to create new data models without creating entirely new standards. The DCTAP draft says this about them:

An application profile defines metadata usage for a specific application. Profiles are often created as texts that are intended for a human audience. These texts generally employ tables to list the elements of the profile and related rules for metadata creation and validation. Such a document is particularly useful in helping a community reach agreement on its needs and desired solutions. To be usable for a specific function these decisions then need to be translated to computer code, which may not be a straightforward task.

About the work

We have defined an (extensible) set of twelve elements that can be used as headers in a table defining an application profile, written a primer showing how they can be used, and along the way developed a framework for talking about metadata and application profiles. We have also worked on various implementations and are set to create a cookbook showing how DC TAP can be used in real-world applications. The primer is the best starting point for understanding the output as a whole.
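
To give a concrete feel for what a TAP looks like, here is a minimal sketch of my own (not an example from the specification): the header row uses the twelve element names, and the single personShape row, which says a Person description needs exactly one literal name, is entirely made up. Python's standard library is enough to read it.

```python
# A minimal, hypothetical TAP: the header row uses the twelve DCTAP element
# names; the single data row says a Person description must have exactly
# one literal foaf:name. Only the Python standard library is needed.
import csv
import io

tap_csv = """shapeID,shapeLabel,propertyID,propertyLabel,mandatory,repeatable,valueNodeType,valueDataType,valueConstraint,valueConstraintType,valueShape,note
personShape,Person,foaf:name,Name,TRUE,FALSE,literal,xsd:string,,,,Every person needs one name
"""

for row in csv.DictReader(io.StringIO(tap_csv)):
    print(row["shapeID"], row["propertyID"], "mandatory:", row["mandatory"])
```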

The framework for talking about metadata came about because we were struggling to be clear when we used terms like property or entity. Does “property” refer to something in the application profile, or in the base standard, or as used in some metadata instance, or does it refer to a property of some thing in the world? In short, we decided that the things being described have characteristics and relationships to each other which are asserted in RDF metadata using statements; those statements have predicates in them, those predicates reference properties that are part of a pre-defined vocabulary, and an application profile defines templates for how a property is used in statements to create descriptions. There is a similar string of suggestions for talking about entities, classes and shapes, as well as some comments on what we found too confusing and so avoid talking about. With a little care you can use terms that are both familiar in context and not ambiguous.

About my role

This really is a team effort, expertly led by Karen Coyle, and I just try to help. I will put my hand up as the literal-minded pedant who needed a framework to make sure we all understood each other. Otherwise I have been treating this as a side project that gives me an excuse to do some Python programming: I have documented my TAP2SHACL and related scripts on this blog, which focus on taking a DCMI TAP and expressing it as SHACL that can be used to validate data instances. I have been using this on some other projects that I am involved in, notably the work with PESC looking at how they might move to JSON-LD.
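
To give a flavour of what the conversion involves, here is a minimal sketch using rdflib; it is not the TAP2SHACL code itself, just an illustration of the SHACL that a TAP row like the hypothetical one above, a mandatory, non-repeatable literal name on a Person, might translate into.

```python
# Illustrative sketch only (not the TAP2SHACL code): build the SHACL that a
# TAP row for a mandatory, non-repeatable literal foaf:name on a Person
# might translate into, using rdflib.
from rdflib import Graph, Namespace, BNode, Literal
from rdflib.namespace import RDF, XSD

SH = Namespace("http://www.w3.org/ns/shacl#")
FOAF = Namespace("http://xmlns.com/foaf/0.1/")
EX = Namespace("http://example.org/shapes/")   # hypothetical base for shape URIs

g = Graph()
g.bind("sh", SH)
g.bind("foaf", FOAF)
g.bind("ex", EX)

shape = EX.personShape
constraint = BNode()
g.add((shape, RDF.type, SH.NodeShape))
g.add((shape, SH.targetClass, FOAF.Person))
g.add((shape, SH.property, constraint))
g.add((constraint, SH.path, FOAF.name))
g.add((constraint, SH.minCount, Literal(1)))   # mandatory = TRUE
g.add((constraint, SH.maxCount, Literal(1)))   # repeatable = FALSE
g.add((constraint, SH.datatype, XSD.string))   # valueDataType = xsd:string

print(g.serialize(format="turtle"))
```

Run it and you get a Turtle serialization of a sh:NodeShape targeting foaf:Person with a single property shape for the name.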

What am I doing here? 2. Open Competencies Network

I am continuing my January review of the projects that I am working on with this post about my work on the Open Competencies Network (OCN). OCN is a part of the T3 Network of Networks, an initiative of the US Chamber of Commerce Foundation aiming to explore “emerging technologies and standards in the talent marketplace to create more equitable and effective learning and career pathways.” Not surprisingly, OCN focuses on Competencies, but we understand that term broadly, including any “assertions of academic, professional, occupational, vocational and life goals, outcomes … for example knowledge, skills and abilities, capabilities, habits of mind, or habits of practice” (see the OCN competency explainer for more). I see competencies understood in this way as the link between my interests in learning, education, credentials and the world of employment and other activities. This builds on previous projects around Talent Marketplace Signalling, which I also did for the US Chamber of Commerce Foundation.

About the work

The OCN has two working groups: Advancing Open Competencies (AOC), which deals with outreach, community building, policy and governance issues, and the Technical Advisory Workgroup. My focus is on the latter. We have a couple of major technical projects, the Competency Explorer and the Data Ecosystem Standards Mapping (DESM) Tool, both of which probably deserve their own post at some time, but in brief:

Competency Explorer aims to make competency frameworks readily available to humans and machines by developing a membership trust network of open registries, each holding one or more competency frameworks, and enabling search and retrieval of those frameworks and their competencies from any registry node in the network.

DESM was developed to support data standards organizations—and the platforms and products that use those standards—in mapping, aligning and harmonizing data standards to promote data interoperability for the talent marketplace (and beyond). The DESM tool allows data to move from a system or product using one data standard to another system or product that uses a different data standard.

Both of these projects deal with heterogeneous metadata, working around the theme of interoperability between metadata standards.

About my role

My friend and former colleague Shiela once described our work as “going to meetings and typing things”, which pretty much sums up the OCN work. The purpose is to contribute to the development of the projects, both of which were initiated by Stuart Sutton, whose shoes I am trying to fill in OCN.

For the Competency Explorer I have helped turn community-gathered use cases into features that can be implemented to enhance the Explorer, and am currently one of the leads of an agile, feature-driven development project with software developers at Learning Tapestry to implement as many of these features as possible and figure out what it would take to implement the others. I’m also working with data providers and Learning Tapestry to develop technical support around providing data for the Competency Explorer.

For DESM I helped develop the internal data schema used to represent the mapping between data standards, and am currently helping to support people who are using the tool to map a variety of standards in a pilot, or closed beta-testing. This has been a fascinating exercise in seeing a project through from a data model on paper, through working with programmers implementing it, to working with people as they try to use the tool developed from it.

DCAT AP DC TAP: a grown up example of TAP to SHACL

I’ve described a couple of short “toy” examples as proof of concept of turning a Dublin Core Application Profile (DC TAP) into SHACL in order to validate instance data: the SHACL Person Example and a Simple Book Example; now it is time to see how the approach fares against a real-world example. I chose the EU joinup Data Catalog Application Profile (DCAT AP) because Karen Coyle had an interest in DCAT, it is well documented (pdf) with a github repo that has SHACL files, there is an Interoperability Test Bed validator for it (albeit a version late), and I found a few test instances with known errors (again a little dated). I also found the acronym soup of DCAT AP DC TAP irresistible.

TAP to SHACL example

Last week I posted Application Profile to Validation with TAP to SHACL about converting a DCMI Tabular Application Profile (DC TAP) to SHACL in order to validate instance data. I ended by saying that I needed more examples in order to test that it worked: that is, to check not only that the SHACL is valid, but also that it validates or raises errors as expected when used with instance data.
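
As a sketch of what that checking can look like in code (using pyshacl rather than any tool mentioned in the post, and with hypothetical file names), the idea is simply to run both known-good and known-bad instance data against the generated SHACL and compare the results with what is expected:

```python
# Sketch of the test described above: the shapes graph should pass known-good
# data and flag known-bad data. pyshacl's validate() returns a tuple of
# (conforms, results_graph, results_text).
from rdflib import Graph
from pyshacl import validate

shapes = Graph().parse("person_shapes.ttl", format="turtle")   # hypothetical files
cases = {
    "good data": "person_ok.ttl",              # should conform
    "bad data": "person_missing_name.ttl",     # should raise a violation
}

for label, filename in cases.items():
    data = Graph().parse(filename, format="turtle")
    conforms, _, report = validate(data, shacl_graph=shapes)
    print(label, "conforms:", conforms)
    if not conforms:
        print(report)
```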

Application Profile to Validation with TAP to SHACL

Over the past couple of years or so I have been part of the Dublin Core Application Profile Interest Group creating the DC Tabular Application Profile (DC-TAP) specification. I described DC-TAP in a post about a year ago as a “human-friendly approach that also lends itself to machine processing”; in this post I’ll explore a little about how it lends itself to machine processing.

SHACL, when two wrongs make a right

I have been working with SHACL for a few months in connexion with validating RDF instance data against the requirements of application profiles. There’s a great validation tool created as part of the JoinUp Interoperability Test Bed that lets you upload your SHACL rules and a data instance and tests the latter against the former. But be aware: some errors in your SHACL can lead to the instance data successfully passing the tests even when it shouldn’t; this isn’t an error with the tool, just a case of blind logic: the program doing what you tell it to regardless of whether that’s what you want it to do.
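
One way this can happen, offered as an illustration of the general failure mode rather than the exact case described in the post: if a shape targets a class name that nothing in the data actually has, the shape selects no focus nodes, so there is nothing for its constraints to fail on. A small sketch with pyshacl:

```python
# Illustration: a typo in sh:targetClass means the shape selects no focus
# nodes, so data that is missing a mandatory name still "conforms".
from rdflib import Graph
from pyshacl import validate

shapes = Graph().parse(data="""
  @prefix sh:   <http://www.w3.org/ns/shacl#> .
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  @prefix ex:   <http://example.org/> .

  ex:PersonShape a sh:NodeShape ;
      sh:targetClass foaf:Persn ;                # typo: should be foaf:Person
      sh:property [ sh:path foaf:name ; sh:minCount 1 ] .
""", format="turtle")

data = Graph().parse(data="""
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  @prefix ex:   <http://example.org/> .

  ex:alice a foaf:Person .                       # no foaf:name at all
""", format="turtle")

conforms, _, _ = validate(data, shacl_graph=shapes)
print(conforms)   # True: the error in the shapes masks the error in the data
```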

When RDF breaks records

In talking to people about modelling metadata I’ve picked up on a distinction mentioned by Stuart Sutton between entity-based modelling, typified by RDF and graphs, and record-based structures, typified by XML; however, I don’t think making this distinction alone is sufficient to explain the difference, let alone why it matters. I don’t want to get into the pros and cons of either approach here, just give a couple of examples of where something that works in a monolithic, hierarchical record falls apart when the properties and relationships for each entity are described separately and those descriptions are put into a graph. These are especially relevant when people familiar with XML or JSON start using JSON-LD. One of the great things about JSON-LD is that you can use instance data as if it were JSON, without really paying much regard to the “LD” part; that’s not true when designing specs, because design choices that would be fine in a JSON record will not work in a linked data graph.
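
As a toy illustration of the general point (my own example, not necessarily one from the full post): in two separate JSON records it looks fine to note, inside each book's contributor object, the role that the contributor played on that book; merge the same data into a graph and both role names end up attached to the one shared person, with nothing tying either role back to its book.

```python
# Toy illustration: two JSON "records" each attach a record-scoped value
# (the contributor's role on that book) to a shared entity. Merged into one
# graph, the roles detach from the books they belonged to.
# Requires rdflib 6+ (which bundles the JSON-LD parser).
import json
from rdflib import Graph, URIRef

def book_record(book_id, role):
    return {
        "@context": {"schema": "https://schema.org/"},
        "@id": book_id,
        "@type": "schema:Book",
        "schema:contributor": {
            "@id": "http://example.org/person/pat",
            "schema:roleName": role,
        },
    }

g = Graph()
g.parse(data=json.dumps(book_record("http://example.org/book/1", "illustrator")), format="json-ld")
g.parse(data=json.dumps(book_record("http://example.org/book/2", "editor")), format="json-ld")

pat = URIRef("http://example.org/person/pat")
roles = sorted(str(o) for o in g.objects(pat, URIRef("https://schema.org/roleName")))
print(roles)  # ['editor', 'illustrator'] -- nothing says which role goes with which book
```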

JDX: a schema for Job Data Exchange

[This rather long blog post describes a project that I have been involved with through consultancy with the U.S. Chamber of Commerce Foundation.  Writing this post was funded through that consultancy.]

The U.S. Chamber of Commerce Foundation has recently proposed a modernized schema for job postings based on the work of HR Open and Schema.org, the Job Data Exchange (JDX) JobSchema+. It is hoped JDX JobSchema+ will not just facilitate the exchange of data relevant to jobs, but will do so in a way that helps bridge the various other standards used by relevant systems. The aim of JDX is to improve the usefulness of job data, including signalling around jobs, addressing such questions as: what jobs are available in which geographic areas? What are the requirements for working in these jobs? What are the rewards? What are the career paths?

This information needs to be communicated not just between employers and their recruitment partners and to potential job applicants, but also to education and training providers, so that they can create learning opportunities that provide their students with skills that are valuable in their future careers. Job seekers empowered with a greater quantity and quality of job data through job postings may secure better-fitting employment faster, and keep it for longer, due to improved matching. Preventing wasted time and hardship may be particularly impactful for populations whose job searches are less well resourced, and for those whose limited flexibility increases their dependence on job details that are often missing, such as schedule, exact location, and security clearance requirements. These are among the properties that JDX gives employers the opportunity to include so that they can be identified quickly and easily by all.

In short, the data should be available to anyone involved in the talent pipeline. This broad scope poses a problem that JDX also seeks to address: different systems within the talent pipeline data ecosystem use different data standards, so how can we ensure that the signalling is intelligible across the whole ecosystem?
