I’ve put ‘final’ in inverted commas in the title of this blog post (which should be sung—of course—to the tune of this song) – because while the JISC-funded Jerome project has indeed come to an end, Jerome itself is going nowhere. We’ll continue to tweak and develop it as an “un-project” (whence it came), and—we sincerely hope—Jerome will lead in time, in whole or in part, to a real, live university library service of awesome.
Before we get started, though, thanks are due to the whole of the Jerome project team: Chris Leach, Dave Raines, Tim Simmonds, Elif Varol, Joss Winn, developers Nick Jackson and Alex Bilbie times a million, and also to people outside the University of Lincoln who have offered support and advice, including Ed Chamberlain, Owen Stephens, and our JISC programme manager, Andy McGregor.
Just what exactly have we produced?
- A public-facing search portal service available at: http://jerome.library.lincoln.ac.uk/
- Featuring search, browse, and bibliographic record views.
- Search is provided by Sphinx.
- A ‘mixing desk’ allows user control over advanced search parameters.
- Each record is augmented by data from OpenLibrary (licensed under CC0) to help boost the depth and accuracy of our own catalogue. Where possible, OpenLibrary also provides our book cover images.
- Bibliographic work pages sport COinS metadata and links to previews from Google Books.
- Item data is harvested from the Library Management System.
- Social tools allow sharing of works on Facebook, Twitter, etc.
- Openly licensed bibliographic data, available at http://data.lincoln.ac.uk/documentation.html#bib, and including:
- 170,000 library catalogue records, released under a CC0 licence
- 3,100 repository records (CC0)
- 92,000 e-journal records (CC0)
- XXX,XXX (we don’t know yet – it hasn’t finished counting…) journal tables of contents derived from JournalTOCs, available under CC-BY
- See our licensing page for more information.
- Attractive, documented, supported APIs for all data, serving a range of formats, with a timeline of data refresh cycles.
- Source code for Jerome will be made open and publicly available (after a shakedown) on GitHub.
- While the user interface, technical infrastructure, analytics and machine learning/personalisation aspects of Jerome have been discussed fairly heavily on the project blog, you’ll have to wait a little while for formal case studies.
- Contributions to community events. We presented/discussed Jerome at:
- JISC Infrastructure for Resource Discovery start-up meeting, Birmingham, March
- M25 Consortium CPD event on mashups and open data, London, April
- JISC/RLUK ‘Opening Data – Opening Doors’ event, Manchester, April (report)
- CILIP UC&R Yorkshire & Humberside training event, Huddersfield, May
- JISC/RLUK Discovery launch event, London, May (report)
- Post-project Jerome/COMET open-data mashup day, Cambridge, August!
- We are also submitting a magazine article to SCONUL Focus about Jerome in the context of next-generation library resource discovery (“OPAC 2.0”) services, to be published later in 2011.
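To give a flavour of the OpenLibrary augmentation mentioned above: the Books API URL and covers URL below are OpenLibrary’s real endpoints, but the merge logic and field names are an illustrative sketch, not Jerome’s actual code.

```python
from urllib.parse import urlencode

OPENLIBRARY_BOOKS_API = "https://openlibrary.org/api/books"
COVERS_API = "https://covers.openlibrary.org/b/isbn/{isbn}-M.jpg"

def books_api_url(isbn):
    """Build an OpenLibrary Books API request for a single ISBN."""
    params = {
        "bibkeys": f"ISBN:{isbn}",
        "format": "json",
        "jscmd": "data",  # ask for full bibliographic data, not just links
    }
    return f"{OPENLIBRARY_BOOKS_API}?{urlencode(params)}"

def augment_record(record, ol_response):
    """Fill gaps in a catalogue record from a (decoded) OpenLibrary response.

    Our own catalogue data always wins; OpenLibrary only supplies
    fields we are missing.
    """
    enriched = dict(record)
    ol = ol_response.get(f"ISBN:{record['isbn']}", {})
    for field in ("title", "publish_date", "number_of_pages"):
        if field not in enriched and ol.get(field) is not None:
            enriched[field] = ol[field]
    # OpenLibrary also hosts cover images, addressable by ISBN.
    enriched["cover_url"] = COVERS_API.format(isbn=record["isbn"])
    return enriched
```

In practice you would fetch `books_api_url(...)` over HTTP and decode the JSON before handing it to `augment_record`; the response shape here is the `jscmd=data` one.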
What ought to be done next?
- There’s a lot more interesting work to be done around the use of activity/recommendation data and Jerome. We’re using the historical library loan data both to provide user recommendations (“People who borrowed X…”), and to inform the search and ranking algorithms of Jerome itself. However, there are lots of other measures of implicit and explicit activity (e.g. use of the social sharing tools) that could be used to provide even more accurate recommendations.
- Jerome has concentrated on data at the bibliographic/work level. But there is potentially even more value to be had out of aggregating and querying library item data (i.e. information about a library’s physical and electronic holdings of individual copies of a bibliographic work) – e.g. using geo-lookup services to highlight the nearest available copies of a work. This is perhaps the next great untapped sphere for the open data/Discovery movement.
- Demonstrate use of the APIs to do cool stuff! Mashing up library data with other sets of institutional data (user profiles, mapping, calendaring data) to provide a really useful ‘portal’ experience for users. Also: tapping into Jerome for reporting/administrative purposes; for example identifying and sanitising bad data!
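The “people who borrowed X…” idea boils down to counting co-occurrence in loan histories. A minimal sketch of that core (the field names and in-memory approach are assumptions for illustration, not Jerome’s schema):

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence(loans):
    """loans: iterable of (borrower_id, work_id) pairs.

    Returns {work_id: Counter of other works borrowed by the same people}.
    """
    by_borrower = {}
    for borrower, work in loans:
        by_borrower.setdefault(borrower, set()).add(work)

    co = {}
    for works in by_borrower.values():
        # Every pair of works in one borrower's history is one co-occurrence.
        for a, b in combinations(sorted(works), 2):
            co.setdefault(a, Counter())[b] += 1
            co.setdefault(b, Counter())[a] += 1
    return co

def recommend(co, work_id, n=5):
    """Top-n 'people who borrowed X also borrowed...' suggestions."""
    return [work for work, _ in co.get(work_id, Counter()).most_common(n)]
```

The same counts could feed a ranking function as a popularity signal, which is roughly the second use the bullet above describes.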
Has Jerome’s data actually been used?
Probably not yet. We were delighted to be able to offer something up (in the form of an early, bare-bones Jerome bibliographic API) to the #discodev Developer Competition, where we still hope to see it used. Also, we are holding a post-project hack day (on 8 August 2011) with the COMET project in Cambridge to share data, code, and best practices around handling Open Data. We certainly intend to make use of the APIs internally to enhance the University of Lincoln’s own library services. If you’re interested in making use of the Jerome open data, please email me or leave a comment here.
What skills did we need?
At the University of Lincoln we have been experimenting with a new (for us) way of managing development projects: the Agile method, using shared tools (Pivotal Tracker, GitHub) to allow a distributed team of developers and interested parties to work together. On a practical level, we’ve had to come to terms with matching a schemaless database architecture with traditional formats for describing resources… Nick and Alex have learned more about library standards and cataloguing practice (*cough*MARC*cough*) than they may have wished! There are also now plans to extend MongoDB training to more staff within the ICT Services department.
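The attraction of a schemaless store here is that records with wildly different fields can live side by side without a rigid table design. A hedged sketch of the idea, using plain dicts as stand-ins for MongoDB documents (the field names and sample values are illustrative, not Jerome’s actual schema):

```python
# Two records a fixed relational schema would struggle to share a table for:
book = {
    "type": "book",
    "title": "Ulysses",
    "isbn": ["0141182806"],
    "subjects": ["Dublin (Ireland) -- Fiction"],
}
ejournal = {
    "type": "e-journal",
    "title": "Ariadne",
    "issn": ["1361-3200"],
    # no ISBN, no subjects -- and the document store doesn't mind
}

def searchable_text(doc):
    """Flatten whichever fields a document happens to have into one
    string -- one plausible way to feed a full-text engine like Sphinx."""
    parts = [doc.get("title", "")]
    parts.extend(doc.get("subjects", []))
    parts.extend(doc.get("isbn", []) + doc.get("issn", []))
    return " ".join(p for p in parts if p)
```

The indexing function simply ignores fields a record lacks, which is exactly the flexibility that makes a schemaless store a comfortable fit for heterogeneous bibliographic data.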
What did we learn along the way?
Three things to take away from Jerome:
- MARC is evil. But still, perhaps, a necessary evil. Until there’s a critical mass of libraries and library applications using newer, saner languages to describe their collections, developers will just have to bite down hard and learn to parse MARC records. Librarians, in turn, need to accept the limitations of MARC and actively engage in developing the alternatives. </lecture>
- Don’t battle: use technology to find a way around licensing issues. Rather than spending time negotiating with third parties to release their data openly, Jerome took a different approach, which was to release openly those (sometimes minimal) bits of data which we know are free from third-party interest, then to use existing open data sources to enhance and extend those records.
- Don’t waste time trying to handle every nuance of a record. Whilst it’s important from a catalogue standpoint, people really don’t care if it’s a main title, subtitle, spine title or any other form of title when they’re searching. Perfection is a goal, but not a restriction. Releasing 40% of data and working on the other 60% later is better than aiming for 100% and never releasing anything.
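On the “bite down and parse MARC” point, a toy parser for the human-readable MARC “breaker” notation gives the flavour of what developers are up against. (Real MARC21 is a binary interchange format; in practice you would reach for an established library such as pymarc rather than anything like this sketch.)

```python
def parse_breaker_line(line):
    """Parse one line of MARC 'breaker' notation, e.g.

        245 10 $aUlysses /$cJames Joyce.

    into (tag, indicators, [(subfield_code, value), ...]).
    """
    tag, indicators, rest = line[:3], line[4:6], line[7:]
    subfields = []
    for chunk in rest.split("$")[1:]:  # anything before the first $ is ignored
        code, value = chunk[0], chunk[1:]
        subfields.append((code, value))
    return tag, indicators, subfields

def title_of(fields):
    """Pull a display title from 245 $a/$b, trimming trailing ISBD punctuation."""
    for tag, _, subs in fields:
        if tag == "245":
            parts = [value for code, value in subs if code in ("a", "b")]
            return " ".join(parts).rstrip(" /:.")
    return None
```

Even this trivial example has to know about positional tags, indicators, subfield codes, and the ISBD punctuation baked into the data itself, which is the “necessary evil” in miniature.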
Thanks! It’s been fun…