George Orwell is blogging, and so is Samuel Pepys. Quite aside from the content (I’m an Orwell fan; the merits of this content were discussed when the blog was launched here, and here), I think this is a brilliant way of putting diaries online as open content. Delivery, at least, relies on software anyone can use for free; you and I can get the text in machine-readable formats, HTML and RSS; each entry gets a URI; the entries can be tagged and commented on; locations can be mapped on Google (Orwell, Pepys), and other concepts mentioned can be linked to encyclopaedia entries; the blog owners could, at least in principle, export the whole lot in XML and stick it in a database to process; and anyone can process entries with text mining software or by setting up a Google custom search engine or . . . .
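To make the "machine-readable" point a little more concrete, here is a minimal sketch of what processing such a feed might look like. It assumes a plain RSS 2.0 feed; the sample feed, its URLs and the `parse_entries` helper are all made up for illustration and are not taken from either blog.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 fragment standing in for a diary blog's feed.
# Titles, URLs and text are placeholders, not real diary entries.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example diary blog</title>
    <item>
      <title>Entry two</title>
      <link>http://example.org/diary/entry-two</link>
      <pubDate>Sun, 10 Aug 2008 08:00:00 +0000</pubDate>
      <category>weather</category>
      <description>Placeholder text of the second entry.</description>
    </item>
    <item>
      <title>Entry one</title>
      <link>http://example.org/diary/entry-one</link>
      <pubDate>Sat, 09 Aug 2008 08:00:00 +0000</pubDate>
      <category>weather</category>
      <category>garden</category>
      <description>Placeholder text of the first entry.</description>
    </item>
  </channel>
</rss>"""


def parse_entries(feed_xml):
    """Return the feed's entries as dicts, oldest first."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title"),
            "uri": item.findtext("link"),
            # RSS 2.0 dates use the RFC 822 format, which email.utils can parse.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
            "tags": [c.text for c in item.findall("category")],
            "text": item.findtext("description"),
        })
    # Blog feeds usually list newest first; sort into chronological order.
    entries.sort(key=lambda e: e["published"])
    return entries


if __name__ == "__main__":
    for entry in parse_entries(SAMPLE_FEED):
        print(entry["published"].date(), entry["title"], entry["uri"], entry["tags"])
```

Once the entries are in a structure like this, loading them into a database or handing the text to a text mining tool is a small further step.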
The two examples above are slightly short of perfect. I like to see the dates for the blog entries matching the dates for the diary entries (the Pepys diary does this; Orwell managed it at first, but then slipped). And I think it would make more sense if the monthly archives were arranged to be read top-to-bottom in chronological order. I also wonder whether hosting on WordPress.com is the best idea. It has its attractions, but the tags in the Orwell blog link to posts from other blogs which are well out of scope, whereas the Pepys diary has some very interesting customizations. Also, if the Orwell blog owners ever do find a way to go back to posting against the diary entry date, I imagine they would have problems setting up redirects so that links to the current posts still work.
1. I guess I should be clear: I’m not saying that these diaries are open content. The Pepys text is from Project Gutenberg, but I don’t know the licensing arrangements for other aspects of the blog; Orwell’s text is still in copyright in many countries (including the UK and the US), and I don’t know the licensing arrangements for the blog.
2. The Orwell diary is on WordPress.com; Pepys uses a customized installation of Movable Type.