It’s been a while since anyone posted about what’s been going on with Jerome, mostly because those pesky students keep taking up valuable messing about time with fiddling little problems like being unable to log in. Okay, I jest. We love students really, since their complaining drives so many of the things we want to do.
First of all, epic backend work has been going on to make our Horizon to Jerome import path a bit slicker. Through a bit of inspiration from Dave Pattern, some XML voodoo, some juggling of arrays, a clever scheduled task and a plain text file I’ve been able to get Horizon imports happening on a rolling basis. It takes us a little under 7 and a half days (7.41 if you really care) to complete a full cycle of imports, iterating through every potential record number in our catalogue to find out if there’s anything useful. It’s not the most efficient method (we’ll build in some smart blank record skipping in a future version), but it does stop us from melting the server with a massive bulk export. At the moment this is throttled back to around half of its theoretical maximum rate whilst we test it, but by the end of the month we’re hoping to have import cycles running at under 5 days, and under 4 by Christmas.
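For the curious, the rolling import idea boils down to something like the sketch below. It is a rough illustration rather than the actual Jerome code: `run_once`, `fetch` and `store` are made-up names, and the batch size and record range are whatever you choose to pass in. A plain text file remembers where the last run got to, and each scheduled run picks up from there, wrapping back to record 1 at the end of the range.

```python
from pathlib import Path
from typing import Callable, Optional


def run_once(
    checkpoint: Path,
    batch_size: int,
    max_record: int,
    fetch: Callable[[int], Optional[dict]],
    store: Callable[[int, dict], None],
) -> int:
    """Import one batch of record numbers and return how many were found.

    `fetch` and `store` are hypothetical stand-ins for "ask the catalogue
    for record N" and "save it in the search index".
    """
    # The plain text file holds the next record number to try.
    position = int(checkpoint.read_text().strip()) if checkpoint.exists() else 1

    imported = 0
    for _ in range(batch_size):
        record = fetch(position)
        if record is not None:  # blank record numbers are simply skipped
            store(position, record)
            imported += 1
        # Move on, wrapping back to 1 after the last possible record number.
        position = position % max_record + 1

    # Save progress only after the batch is done, so a crash repeats the
    # batch on the next run instead of silently skipping it.
    checkpoint.write_text(str(position))
    return imported
```

Calling `run_once` from a scheduled task every few minutes gives a steady trickle of imports, and throttling is just a matter of a smaller batch or a longer gap between runs.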
We’ve also started work on our Journals indexing. This is a bit more tricky due to the lack of open information for a lot of journals, but by tapping in to Journal TOCs we can get hold of a fair bit of journal information and tables of contents, allowing article-level searching of all our available resources. As with the catalogue import this is a rolling import process, so things may not appear for a day or two.
For all resources (catalogue and journals) we’re taking a look at what open data we can grab from elsewhere on the internet to bolster search results. We’d really like to be able to grab summaries, abstracts and synopses wherever possible (it’s something else to search through) but there are a few licensing issues we need to look at in more detail. Regardless, we will soon be running all our available search content through term extractors to automatically generate keywords.
Finally on the backend, I’ve made some sweeping changes to how search reindexes content (it’s now fully automated and bitching fast), and some tweaks to our search API to support weighting data (so your results really are more relevant, and we don’t give things like stemmed words and metaphones the same search priority as your original text) and our upcoming relevancy engine (more below).
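To give a flavour of what weighting means here, the sketch below scores a result by how it matched. The weights and the names (`FIELD_WEIGHTS`, `weighted_score`, `rank`) are invented for illustration and are not the values or the API Jerome uses; the point is only that a hit on your original text counts for more than a hit on a stemmed word or a metaphone.

```python
# Hypothetical weights: a match on the original text counts in full,
# looser matches count for progressively less.
FIELD_WEIGHTS = {"original": 1.0, "stemmed": 0.5, "metaphone": 0.25}


def weighted_score(matches: dict[str, int]) -> float:
    """Sum the match counts per match type, scaled by that type's weight."""
    return sum(FIELD_WEIGHTS.get(kind, 0.0) * count for kind, count in matches.items())


def rank(results: dict[str, dict[str, int]]) -> list[str]:
    """Return result titles ordered from most to least relevant."""
    return sorted(results, key=lambda title: weighted_score(results[title]), reverse=True)
```

So a record that matches your exact words once will outrank one that only matches on a couple of sound-alikes.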
On the front-end side Alex has been making a variety of tweaks to our geolocation service, which can (intelligently using a rather large ruleset) determine exactly which of our libraries you’re most likely to be interested in, meaning that the people in Hull aren’t bombarded with information about the GCW.
We’re also pleased to announce that our iPad app for roving library staff is coming along nicely, and we’ve happily got it talking to the backend APIs for item retrieval. There are still quite a few things we need to fix on the UI side and it needs all the screws tightening so that it doesn’t monumentally break if things go wrong. Keep an eye out for Library staff members walking around with iPads sometime in the next few months.
Paul (our inside man) has been taking a look at exactly how best to present what we’re doing with Jerome to staff and students, avoiding connotations of “we built this to replace bits of the current system” (which we didn’t) and instead getting the message across that we’re really aiming blue-sky, seeing just how awesome we can make individual bits (which we did). This comes with a lot of stuff about how things that we’re working on might actually fit into the Library in the future, which bits just won’t work and how some things would work better. If someone in a trendy orange t-shirt accosts you and asks a couple of questions, try to be helpful, since it’s your library we’re working on here.
Finally, our big new announcement for the next Really Cool And Epically Awesome bit of Jerome: the somewhat boringly named Relevancy Engine. This is something we’ve been toying with the notion of for a while, but we’ve finally worked out how to do it and how it fits into the big plan. In short, it will do its best to make sure that what you get at the top of your search results is exactly what you’re looking for. It takes variables such as the books you’ve borrowed in the past, how long they’ve been out for, which course you’re doing, what year you’re in, borrowing habits of others on your course, past borrowing trends, your physical location, how many books you currently have out, the time of day and even the weather (who wants to walk to the library when it’s raining?) and uses them to subtly adjust which resources we present to you at any given moment. If the library is closed, ebooks will drift up your search results. Everybody on your course borrowing a specific book? It’s a fair bet that’s what you want, even if there are more specific title matches for your search. Postgraduate student? You’re probably more interested in journals than a fresher. These variables will all be taken into account along with our search weighting (how ‘close’ a given item is to what you searched for) when we work out the search rankings.
The whole relevancy engine will be built on top of a machine learning (or, if you insist, AI) system so all the above situations and outcomes are entirely hypothetical, but it does mean that the relevancy weightings are based on how people really behave. Over time it will get better, eventually being smart enough to recommend books based on your current situation.
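The real thing will learn its adjustments rather than have them written by hand, but a toy version with two hard-coded rules shows the shape of the idea. Everything in this sketch is an assumption for illustration: the `Item` and `Context` classes, the field names and the boost sizes are all invented.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    text_score: float      # how 'close' the item is to the search terms
    is_ebook: bool = False
    course_loans: int = 0  # loans by other people on the searcher's course


@dataclass
class Context:
    library_open: bool
    course_size: int       # number of people on the searcher's course


def adjusted_score(item: Item, ctx: Context) -> float:
    """Nudge the text score using context (hand-picked, hypothetical boosts)."""
    boost = 1.0
    if item.is_ebook and not ctx.library_open:
        boost += 0.25  # ebooks drift up when the library is closed
    if ctx.course_size > 0:
        # The more of your course has borrowed it, the bigger the nudge.
        boost += 0.5 * min(item.course_loans / ctx.course_size, 1.0)
    return item.text_score * boost


def rank(items: list[Item], ctx: Context) -> list[Item]:
    """Order items by adjusted score, best first."""
    return sorted(items, key=lambda item: adjusted_score(item, ctx), reverse=True)
```

Note that the context only scales the text score, so an item that has nothing to do with your search never gets dragged to the top just because it is popular.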
On top of that, we can give you real-time sliders which let you tweak your own relevancies. Prefer ebooks? Drag the slider towards the ‘electronic’ side and watch as the physical books slide down the list. More journals, with a focus on biomechanics? Turn up the preference for journals and enter ‘biomechanics’ as a prioritised search term and we’ll do our best to match what you’ve told us you’re looking for. We’ll even remember your preferences between visits (as long as you’re logged in), and they’ll even affect your searches on different devices (so your mobile is in harmony with your library searching zen as well).
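As a rough sketch of how a slider and a prioritised term could feed into the rankings, assuming a slider that runs from -1.0 (all physical) to +1.0 (all electronic): the function names, the tuple layout and the multipliers below are invented for this example and are not the finished design.

```python
from typing import Optional


def slider_multiplier(slider: float, is_electronic: bool) -> float:
    """Turn a slider position into a score multiplier between 0.5 and 1.5."""
    slider = max(-1.0, min(1.0, slider))  # clamp out-of-range values
    direction = slider if is_electronic else -slider
    return 1.0 + 0.5 * direction


def apply_preferences(
    results: list[tuple[str, float, bool]],
    slider: float = 0.0,
    priority_term: Optional[str] = None,
) -> list[str]:
    """Re-rank (title, score, is_electronic) results using the user's preferences."""
    def score(result: tuple[str, float, bool]) -> float:
        title, base, is_electronic = result
        value = base * slider_multiplier(slider, is_electronic)
        if priority_term and priority_term.lower() in title.lower():
            value *= 1.5  # hypothetical boost for a prioritised search term
        return value

    return [title for title, _, _ in sorted(results, key=score, reverse=True)]
```

Storing the slider position and the prioritised term against your account is what would let the same preferences follow you from your desktop to your mobile.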
Obviously there’s a lot more going on (including some cool stuff around library usability in general, such as text-based room bookings), but I hope this has given you a bit of an idea what we’ve been working on.
