Posts Tagged ‘licensing’

It’s the ‘Final’ Blog-Post

Posted on August 1st, 2011 by Paul Stainthorp


I’ve put ‘final’ in inverted commas in the title of this blog post (which should be sung—of course—to the tune of this song) – because while the JISC-funded Jerome project has indeed come to an end, Jerome itself is going nowhere. We’ll continue to tweak and develop it as an “un-project” (from whence it came), and—we sincerely hope—Jerome will lead in time, in whole or in part, to a real, live university library service of awesome.

Before we get started, though, thanks are due to the whole of the Jerome project team: Chris Leach, Dave Raines, Tim Simmonds, Elif Varol, Joss Winn; developers Nick Jackson and Alex Bilbie times a million; and also to people outside the University of Lincoln who have offered support and advice, including Ed Chamberlain, Owen Stephens, and our JISC programme manager, Andy McGregor.


Just what exactly have we produced?

  1. A public-facing search portal service available at:
    • Featuring search, browse, and bibliographic record views.
    • Search is provided by Sphinx.
    • A ‘mixing desk’ allows user control over advanced search parameters.
    • Each record is augmented by data from OpenLibrary (licensed under CC0) to help boost the depth and accuracy of our own catalogue. Where possible, OpenLibrary also provides our book cover images.
    • Bibliographic work pages sport COinS metadata and links to previews from Google Books.
    • Item data is harvested from the Library Management System.
    • Social tools allow sharing of works on Facebook, Twitter, etc.
  2. Openly licensed bibliographic data, available at, and including:
  3. Attractive, documented, supported APIs for all data, with a timeline of data refresh cycles. The APIs will provide data in the following formats:
    1. RDF/XML
    2. JSON
    3. RIS
    4. The potential for MARC
  4. Source code for Jerome will be made Open and publicly available (after a shakedown) on GitHub.
  5. While the user interface, technical infrastructure, analytics and machine learning/personalisation aspects of Jerome have been discussed fairly heavily on the project blog, you’ll have to wait a little while for formal case studies.
  6. Contributions to community events. We presented/discussed Jerome at:

What ought to be done next?

  1. There’s a lot more interesting work to be done around the use of activity/recommendation data and Jerome. We’re using the historical library loan data both to provide user recommendations (“People who borrowed X…”), and to inform the search and ranking algorithms of Jerome itself. However, there are lots of other measures of implicit and explicit activity (e.g. use of the social sharing tools) that could be used to provide even more accurate recommendations.
  2. Jerome has concentrated on data at the bibliographic/work level. But there is potentially even more value to be had out of aggregating and querying library item data (i.e. information about a library’s physical and electronic holdings of individual copies of a bibliographic work) – e.g. using geo-lookup services to highlight the nearest available copies of a work. This is perhaps the next great untapped sphere for the open data/Discovery movement.
  3. Demonstrate use of the APIs to do cool stuff! Mashing up library data with other sets of institutional data (user profiles, mapping, calendaring data) to provide a really useful ‘portal’ experience for users. Also: tapping into Jerome for reporting/administrative purposes; for example identifying and sanitising bad data!
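
To make the “People who borrowed X…” idea in point 1 a bit more concrete, here is a minimal sketch of a co-borrowing count in Python. It is not Jerome’s actual recommendation code: the loan tuples, the identifiers and the function names `build_co_borrow_counts` and `also_borrowed` are all made up for illustration, and a real service would want to anonymise borrowers and weight by recency or popularity.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_borrow_counts(loans):
    """Count how often two works were borrowed by the same person.

    `loans` is an iterable of (borrower_id, work_id) pairs. This layout is
    an assumption for the sketch, not the format of any real loan export.
    """
    works_by_borrower = defaultdict(set)
    for borrower_id, work_id in loans:
        works_by_borrower[borrower_id].add(work_id)

    co_counts = defaultdict(Counter)
    for works in works_by_borrower.values():
        # Every pair of works on one borrower's history counts once.
        for a, b in combinations(sorted(works), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def also_borrowed(co_counts, work_id, limit=5):
    """Return the works most often borrowed alongside `work_id`."""
    ranked = sorted(
        co_counts.get(work_id, {}).items(),
        key=lambda item: (-item[1], item[0]),  # highest count first, then by id
    )
    return [other for other, _ in ranked[:limit]]


# Made-up loan history, purely illustrative.
loans = [
    ("u1", "work-a"), ("u1", "work-b"),
    ("u2", "work-a"), ("u2", "work-b"), ("u2", "work-c"),
    ("u3", "work-a"), ("u3", "work-c"),
    ("u4", "work-b"),
]
co_counts = build_co_borrow_counts(loans)
print(also_borrowed(co_counts, "work-a"))
```

Other activity signals (shares, views) could be fed into the same counts as extra (person, work) pairs, which is one simple way to fold in the implicit and explicit measures mentioned above.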

Has Jerome’s data actually been used?

Probably not yet. We were delighted to be able to offer something up (in the form of an early, bare-bones Jerome bibliographic API) to the #discodev Developer Competition, where we still hope to see it used. Also, we are holding a post-project hack day (on 8 August 2011) with the COMET project in Cambridge to share data, code, and best practices around handling Open Data. We certainly intend to make use of the APIs internally to enhance the University of Lincoln’s own library services. If you’re interested in making use of the Jerome open data, please email me or leave a comment here.

What skills did we need?

At the University of Lincoln we have been experimenting with a new (for us) way of managing development projects: the Agile method, using shared tools (Pivotal Tracker, GitHub) to allow a distributed team of developers and interested parties to work together. On a practical level, we’ve had to come to terms with matching a schemaless database architecture with traditional formats for describing resources… Nick and Alex have learned more about library standards and cataloguing practice (*cough*MARC*cough*) than they may have wished! There are also now plans to extend MongoDB training to more staff within the ICT Services department.

What did we learn along the way?

Three things to take away from Jerome:

  1. MARC is evil. But still, perhaps, a necessary evil. Until there’s a critical mass of libraries and library applications using newer, more sane languages to describe their collections, developers will just have to bite down hard and learn to parse MARC records. Librarians, in turn, need to accept the limitations of MARC and actively engage in developing the alternative</lecture over>.
  2. Don’t battle: use technology to find a way around licensing issues. Rather than spending time negotiating with third parties to release their data openly, Jerome took a different approach, which was to release openly those (sometimes minimal) bits of data which we know are free from third-party interest, then to use existing open data sources to enhance and extend those records.
  3. Don’t waste time trying to handle every nuance of a record. Whilst it’s important from a catalogue standpoint, people really don’t care if it’s a main title, subtitle, spine title or any other form of title when they’re searching. Perfection is a goal, but not a restriction. Releasing 40% of data and working on the other 60% later is better than aiming for 100% and never releasing anything.
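
As a rough illustration of point 3, the sketch below folds every title-like field of a record into one searchable string instead of modelling each variant separately. It assumes a record has already been parsed into a plain dictionary keyed by MARC tag (245 for the title statement, 246 for varying forms of title); that dictionary layout, the sample record and the function name `searchable_title` are hypothetical and are not how Jerome stores its data.

```python
def searchable_title(record):
    """Flatten title and variant-title subfields into one search string.

    `record` maps a MARC tag to a list of fields, each a dict of
    subfield code -> value. This layout is assumed for the sketch.
    """
    parts = []
    # 245 = title statement, 246 = varying form of title;
    # $a is the title proper, $b the remainder of the title.
    for tag in ("245", "246"):
        for field in record.get(tag, []):
            for code in ("a", "b"):
                # Drop the trailing ISBD punctuation that MARC data carries.
                value = field.get(code, "").strip(" /:;,.")
                if value and value not in parts:
                    parts.append(value)
    return " ".join(parts)


# Made-up record for illustration only.
record = {
    "245": [{
        "a": "Three men in a boat :",
        "b": "to say nothing of the dog /",
        "c": "Jerome K. Jerome.",
    }],
    "246": [{"a": "3 men in a boat"}],
}
print(searchable_title(record))
```

The cataloguing distinctions are still there in the source record if anyone needs them later; the search index simply doesn’t have to care.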

Thanks! It’s been fun…

Paul Stainthorp
July, 2011

Iteration Roundup

Posted on April 21st, 2011 by Nick Jackson

Another week, another iteration down. Here’s the summary for last week:

  • JournalTOCs Licensing: This seems to be a CC-BY licence, but we’re just double checking with Heriot-Watt whether this licenses the API itself, or the data that comes from it.
  • Journal entries in search now sport availability dates in a nice human readable format (eg “From 1982 to now”, “From 1996 to 6 months ago”)
  • If you add Jerome to an iOS device home screen it now has a slick new icon.
  • Item pages for catalogue items include availability of current stock in most cases. We’re aware of some journals which are catalogued as books where this isn’t the case, and we’re working on it.
  • Fixed a bug where our import script was importing empty records for books which didn’t exist. Blamed Horizon Information Portal for returning pages of empty content rather than an HTTP 404.
  • Journal search results now point to an individual Jerome item page, rather than directly to our OpenURL resolver. OpenURL now lives in the bright orange “Online” box in the top right of an item page.
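
For anyone curious what the human-readable availability dates might look like in code, here is a small Python sketch. It is not the code running in Jerome: it assumes, purely for illustration, that a journal’s coverage is held as a start year plus an embargo in months, and the names `describe_lag` and `describe_availability` are invented.

```python
def describe_lag(months):
    """Turn an embargo length in months into a phrase such as '6 months ago'."""
    if months <= 0:
        return "now"
    if months % 12 == 0:
        years = months // 12
        return f"{years} year{'s' if years != 1 else ''} ago"
    return f"{months} month{'s' if months != 1 else ''} ago"


def describe_availability(start_year, embargo_months=0):
    """Describe journal coverage, assuming a start year and an embargo in months."""
    return f"From {start_year} to {describe_lag(embargo_months)}"


print(describe_availability(1982))     # From 1982 to now
print(describe_availability(1996, 6))  # From 1996 to 6 months ago
```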

In other news, we’ve spent a fair bit of time starting to boost our ‘master plan’ mind map, and have moved a lot of the development points into the iterative model so that we’ll get round to them eventually. At some point next week I’ll be trying to get them into some kind of rough order out of the icebox so we can start to forecast iterations. Don’t forget that you can follow our current progress on our tracker if you’re really interested in the inner workings.


Posted on March 21st, 2011 by Paul Stainthorp

We acknowledge the issues surrounding the licensing of the data we hold but nevertheless wish to follow the recommendation made by JISC that universities “proceed on the presumption that their bibliographic data will be made freely available for use and reuse.” [1] We will assess the licensing of our bibliographic data against the guidance provided by JISC. [2]

We are working closely with the COMET project (another JISC-funded, RDTF project), who are exploring similar issues around identifying potential licence conditions attached to (in particular) MARC records.

As regards the data produced by Jerome itself:

  • It is our intention that all bibliographic data will be made available under an Open Data Commons license.
  • All documentation will be made available under a CC-BY license.
  • Any code that can be usefully made public will be licensed under an Apache/BSD style license and we will seek advice from OSSWatch on this matter.
  1. []
  2. []