Posts Tagged ‘API’

Is that a Jerome open data API I spy?

Posted on June 28th, 2011 by Paul Stainthorp

Yes. Yes, it is.

http://data.online.lincoln.ac.uk/documentation.html#bib

This is only the initial, bare-bones JSON-only service. A complete (and fully-documented) API will be released in stages over the next month, providing data in a range of output formats. We’re keeping all API and open institutional data documentation in the one place, on our open data site.
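If you want to start poking at the JSON service straight away, here's a minimal sketch of fetching and decoding a record in PHP. The endpoint path and record ID below are purely illustrative placeholders; see the documentation page linked above for the real URL structure.

    <?php
    // Hypothetical example only: the path and ID below are placeholders.
    // Check the documentation page above for the actual endpoints.
    $url  = 'http://data.online.lincoln.ac.uk/bib/EXAMPLE_ID';
    $json = file_get_contents($url);

    if ($json === false) {
        die("Request failed\n");
    }

    $record = json_decode($json, true);

    // Dump whatever comes back; the field names depend on the API.
    print_r($record);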

EMALINK reimagine the OPAC

Posted on November 25th, 2010 by Paul Stainthorp

Chris Leach and I took Jerome to Loughborough University yesterday (24 November 2010), to an EMALINK seminar on next-generation OPACs. Here’s a copy of our presentation slides.

It was a particularly useful event, especially for being packed into 2½ hours (and worth learning to drive an automatic in order to get there!): a presentation from Loughborough about their project to select a next-generation OPAC system; group discussions around some of the factors involved in launching such services; and our own contribution, which led to some interesting conversations about the benefits and risks of experimentation in libraries.

Jerome itself passed something of a milestone this week: having finally crawled its way round the whole of Lincoln’s catalogue, it now contains a full set of our MARC records (all 214,006 of them!); each work with its own stable, persistent URL (/work/<bibnumber>). Nick Jackson has also started to play around with pulling in additional data and services from external APIs (e.g., book cover images).
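As a flavour of the sort of external API involved, here's a rough sketch (not Jerome's actual code) that builds a cover image URL for a work from its ISBN, using the Open Library Covers service as one possible source.

    <?php
    // Rough illustration only - not the code running inside Jerome.
    // Open Library serves cover images by ISBN at a predictable URL.
    function cover_url($isbn, $size = 'M') {
        // $size can be S, M or L
        return 'http://covers.openlibrary.org/b/isbn/' . urlencode($isbn) . '-' . $size . '.jpg';
    }

    echo cover_url('9780141439518') . "\n";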

[Screenshot of a Jerome work record]

(Yes, there’s a problem with authors being attached to the wrong records. We’re on it. In fact, Jerome will self-heal its “leaky array” problem over the course of the next week.)

Engage Ludicrous Speed!

Posted on July 23rd, 2010 by Nick Jackson

One of our key aims for Jerome is for the whole thing to be fast. Not “the average search should complete in under a second” fast, but “your application should be fine to hit us with 50 queries a second” fast.

This requirement was one of the key factors in our decision to use MongoDB as our backend database, with search provided by Sphinx. We’ll have another blog post fairly soon with more detail on how we’re using Mongo and Sphinx to store, search and retrieve data, but for now I’d like to share some preliminary numbers on how close we are to our speed goal.
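We’ll save the detail for that post, but as a taster, the sketch below shows roughly how a search could be wired together: ask Sphinx for the IDs of matching works, then pull the full records back out of MongoDB. The index, database and collection names are invented for illustration.

    <?php
    // Illustrative sketch only - index and collection names are made up.
    require 'sphinxapi.php';   // the PHP client bundled with Sphinx

    // 1. Ask Sphinx for the IDs of matching works.
    $sphinx = new SphinxClient();
    $sphinx->SetServer('localhost', 9312);
    $sphinx->SetMatchMode(SPH_MATCH_EXTENDED2);
    $result = $sphinx->Query('dickens', 'works_index');

    $ids = array();
    if (!empty($result['matches'])) {
        $ids = array_keys($result['matches']);
    }

    // 2. Fetch the full records from MongoDB by those IDs.
    $mongo  = new Mongo();                  // the (then-current) PHP Mongo driver
    $works  = $mongo->jerome->works;
    $cursor = $works->find(array('bibnumber' => array('$in' => $ids)));

    foreach ($cursor as $work) {
        echo $work['title'] . "\n";
    }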

First of all, getting data in. This is a pain in the backside, because the MARC-21 specification is so complex and we need to perform several repetitive checks on the data to make sure we’re importing it right. Even so, we’re importing in the region of 150 MARC records a second, including parsing, filtering, mapping fields and finally getting the data into the database. This is done using the File_MARC PEAR library to handle the actual parsing of the MARC data into a set of arrays, then some custom PHP to extract information like title, author, publisher etc. into a more readily understood format. This information extraction isn’t complete yet, so it’s likely there’ll be a bit of a slowdown as we add more translation rules; equally, it hasn’t yet been optimised for speed.
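To make that concrete, here's a stripped-down sketch of what an import loop along those lines looks like. It follows the standard File_MARC usage; the two field mappings shown are just examples rather than our full set of translation rules, and the filename and database names are invented.

    <?php
    // Simplified sketch of the import pipeline - not the production code.
    require 'File/MARC.php';   // PEAR File_MARC library

    $marc  = new File_MARC('catalogue.mrc');   // raw MARC-21 export (example filename)
    $mongo = new Mongo();
    $works = $mongo->jerome->works;            // illustrative database/collection names

    while ($record = $marc->next()) {
        $doc = array();

        // Title: MARC field 245, subfield a
        if (($field = $record->getField('245')) && ($sub = $field->getSubfield('a'))) {
            $doc['title'] = trim($sub->getData());
        }

        // Main author: MARC field 100, subfield a
        if (($field = $record->getField('100')) && ($sub = $field->getSubfield('a'))) {
            $doc['author'] = trim($sub->getData());
        }

        $works->insert($doc);
    }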


The university of linking

Posted on July 21st, 2010 by Paul Stainthorp

Today’s Talis Linked Data and Libraries open day has motivated me to make a list of some of the external data tools, web services and APIs that could well end up being sucked into Jerome’s vortex of general awesomeness.

I was inspired (possibly through drinking too much SPARQL-themed coffee) by the thought that 2010 is effectively ‘year 1’ for library-themed Linked Data. (But I promise I’ll try to keep the ‘Lincoln’/‘linking’ puns to a minimum after this post…)

[Image: Library linked data cloud]

“With the emergence of large, centralized sources entry to the Linked Data cloud might be easier than you think” (Ross Singer, The Linked Library Data Cloud: Stop talking and start doing, Code4Lib 2010)

So, which of these will make their way into the Jerome toolkit? (I’ll say now, before I get in trouble, that they’re not all purely Linked Data!) …compiled in part from these other lists, and from discussions/examples at the Talis event:

What have I missed? Which of these are not worth bothering with; which should we get stuck into without delay? You know where the comment form is…