I’ve been experimenting with ways of putting JSON-LD schema.org metadata into HTML created by MkDocs. The result is a python-markdown plugin that will (hopefully) find blocks of YAML in markdown and insert them, as JSON-LD, into the HTML that is generated. You can find the plugin on GitHub, and you can read more about its development in some pages generated by MkDocs (which, incidentally, use the plugin).
What’s it do?
Markdown is a widely used simple text format that allows formatting of text using inline markup; it’s a bit like the markup used on MediaWiki/Wikipedia. MkDocs is a python program that will build HTML pages out of markdown and templates. It’s geared towards the production of software/spec documentation, and we have been using it for documenting the metadata spec we’re creating for educational materials in the K12OCX project. (You’ll see the OCX part, Open Content Exchange, made it through to the plugin name.) Steve Midgley suggested that we might go further and use markdown to create the learning resources, somehow generating the metadata along with the HTML. MkDocs can be extended in a number of ways that would facilitate this, but most relevantly it uses the python markdown module, which has an API allowing for extensions.
YAML seemed like the obvious way of putting metadata into markdown. It’s another simple text format, for expressing key-value pairs, where the values can be lists or sets of other key-value pairs. It’s already used by MkDocs for specifying the site structure, and a number of extensions to python markdown already use it.
So the ocxmd extension for python markdown will look for blocks of YAML in a markdown document and replace them with blocks of JSON-LD in the HTML that is generated. It also provides the metadata as a python dict in the markdown object. Feel free to try it out (cautiously) and let me know how it goes wrong / what it doesn’t do that it should / what it does that it shouldn’t …
How’s it work?
Quite simple really. I took inspiration from an existing YAML in markdown processor written by Nikita Sivakov (who’s probably a better programmer than me, so if his plugin does what you want, use it, not mine). The YAML is separated before and after by a triple dash (‘---’) on a line by itself.
The plugin extends the python markdown Preprocessor so that it goes through the markdown document line-by-line looking for a triple dash. Until it finds one, the text is copied into what will be returned as the ‘pure’ markdown document for further processing. When it finds a triple dash, it instead copies the following lines into what will be processed as YAML (along with a few lines that will become the JSON-LD context). When it finds the closing triple dash, it processes the YAML using a python library, and then writes the result as JSON-LD, between a pair of <script> tags, into the markdown to be returned. It also stores the python dict generated by the YAML processor as a new property in the markdown object. Then it goes back to reading line-by-line, copying the text straight into what will be returned as markdown until it meets the end of file or the next triple dash.
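The line-by-line logic described above can be sketched roughly as follows. This is a minimal illustration and not the plugin’s actual code: the function name `split_yaml_blocks` is made up for the example, the extra JSON-LD context lines are left out, and the YAML parser is passed in as a function (something like `yaml.safe_load` from PyYAML would be a natural choice) so that the sketch itself needs only the standard library. The `type` attribute on the script tag is also an assumption.

```python
import json


def split_yaml_blocks(lines, parse_yaml):
    """Separate '---'-delimited YAML blocks from markdown lines.

    Returns (markdown_lines, metadata). Each YAML block is replaced in
    place by a JSON-LD <script> element, and its parsed form is kept
    in the metadata list.
    """
    md_lines = []    # the 'pure' markdown, plus generated <script> blocks
    yaml_lines = []  # lines of the YAML block currently being read
    metadata = []    # one parsed object per YAML block
    in_yaml = False

    for line in lines:
        if line.strip() == "---":
            if in_yaml:
                # Closing delimiter: parse the block and emit JSON-LD.
                data = parse_yaml("\n".join(yaml_lines))
                metadata.append(data)
                md_lines.append('<script type="application/ld+json">')
                md_lines.extend(json.dumps(data, indent=4).splitlines())
                md_lines.append("</script>")
                yaml_lines = []
            in_yaml = not in_yaml
        elif in_yaml:
            yaml_lines.append(line)
        else:
            md_lines.append(line)

    # Note: a block left unclosed at end of file is silently dropped here.
    return md_lines, metadata
```

In a real python markdown extension this loop would sit in the `run(self, lines)` method of a `Preprocessor` subclass, which receives the document as a list of lines and returns the modified list.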
Once it is installed in a suitable python environment, you can add it to the extensions available to MkDocs with an entry in the markdown_extensions block of mkdocs.yml.
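As a rough guide, the entry might look something like the fragment below. It assumes the extension is registered under the name ocxmd and needs no options; check the plugin’s own documentation for the exact name and any settings it accepts.

```yaml
# mkdocs.yml (fragment); the extension name is an assumption
markdown_extensions:
    - ocxmd
```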
What does JSON-LD in YAML look like?
The metadata that we use for OCX is a profile of schema.org / LRMI, OERSchema and a few bits that we have added because we couldn’t find them elsewhere. Here’s what (mostly) schema.org metadata looks like in YAML:
"@context": - "http://schema.org" - "oer": "http://oerschema.org/" - "ocx": "https://github.com/K12OCX/k12ocx-specs/" "@id": "#Lesson1" "@type": - oer:Lesson - CreativeWork learningResourceType: LessonPlan hasPart: "@id": "#activity1-1" author: "@type": Person name: Phil Barker