Using wikidata for linked data WordPress indexes

A while back I wrote about getting data from wikidata into a WordPress custom taxonomy. Shortly thereafter Alex Stinson said some nice things about it:


and as a result that post got a little attention.

Well, I have now a working prototype plugin which is somewhat more general purpose than my first attempt.

1.Custom Taxonomy Term Metadata from Wikidata

Here’s a video showing how you can create a custom taxonomy term with just a name and the wikidata Q identifier, and the plugin will pull down relevant wikidata for that type of entity:

[similar video on YouTube]

2. Linked data index of posts

Once this taxonomy term is used to tag a post, you can view the term’s archive page, and if you have a linked data sniffer, you will see that the metadata from WikiData is embedded in machine readable form using schema.org. Here’s a screenshot of what the OpenLink structured data sniffer sees:

Or you can view the Google structured data testing tool output for that page.

Features

  • You can create terms for custom taxonomies with just a term name (which is used as the slug for the term) and the Wikidata Q number identifier. The relevant name, description and metadata is pulled down from Wikidata.
  • Alternatively you can create a new term when you tag a post and later edit the term to add the wikidata Q number and hence the metadata.
  • The metadata retrieved from Wikidata varies to be suitable for the class of item represented by the term, e.g. birth and death details for people, date and location for events.
  • Term archive pages include the metadata from wikidata as machine readable structured data using schema.org. This includes links back to the wikidata record and other authority files (e.g. ISNI and VIAF). A system harvesting the archive page for linked data could use these to find more metadata. (These onward links put the linked in linked data and the web in semantic web.)
  • The type of relationship between the term and posts tagged with it is recorded in the schema.org structure data on the term archive page. Each custom taxonomy is for a specific type of relationship (currently about and mentions, but it would be simple to add others).
  • Short codes allow each post to list the entries from a custom taxonomy that are relevant for it using a simple text widget.
  • This is a self-contained plugin. The plugin includes default term archive page templates without the need for a custom theme. The archive page is pretty basic (based on twentysixteen theme) so you would get better results if you did use it as the basis for an addition to a custom theme.

How’s it work / where is it

It’s on github. Do not use it on a production WordPress site. It’s definitely pre-alpha, and undocumented, and I make no claims for the code to be adequate or safe. It currently lacks error trapping / exception handling, and more seriously it doesn’t sanitize some things that should be sanitized. That said, if you fancy giving it a try do let me know what doesn’t work.

It’s based around two classes: one which sets up a custom taxonomy and provides some methods for outputting terms and term metadata in HTML with suitable schema.org RDFa markup; the other handles getting the wikidata via SPARQL queries and storing this data as term metadata. Getting the wikidata via SPARQL is much improved on the way it was done in the original post I mentioned above. Other files create taxonomy instances, provide some shortcode functions for displaying taxonomy terms and provide default term archive templates.

Where’s it going

It’s not finished. I’ll see to some of the deficiencies in the coding, but also I want to get some more elegant output, e.g. single indexes / archives of terms from all taxonomies, no matter what the relationship between the post and the item that the term relates to.

There’s no reason why the source of the metadata need be Wikidata. The same approach could be with any source of metadata, or by creating the term metadata in WordPress. As such this is part of my exploration of WordPress as a semantic platform. Using taxonomies related to educational properties would be useful for any instance of WordPress being used as a repository of open educational resources, or to disseminate information about courses, or to provide metadata for PressBooks being used for open textbooks.

I also want to use it to index PressBooks such as my copy of Omniana. I think the graphs generated may be interesting ways of visualizing and processing the contents of a book for researchers.

Licenses: Wikidata is CC:0, the wikidata logo used in the featured image for this post is sourced from wikimedia and is also CC:0 but is a registered trademark of the wikimedia foundation used with permission. The plugin, as a derivative of WordPress, will be licensed as GPLv2 (the bit about NO WARRANTY is especially relevant).

One thought on “Using wikidata for linked data WordPress indexes

Comments are closed.