Archive for the ‘JISC’ Category

It’s the end of Jerome as we know it (but I feel fine)

Posted on November 28th, 2011 by Paul Stainthorp


The University of Lincoln’s Jerome project finished in August with the successful release of more than 240,000 openly-licensed bibliographic records, available over developer APIs, and a joint hack day with Cambridge University Library’s COMET project.

Now, encouraged by positive JISC feedback, Cambridge and Lincoln have jointly applied for follow-up project funding under the project title CLOCK. If our bid is successful, the new project will run from December 2011 to July 2012, employing a web developer based at the University of Lincoln, and distilling the work of both institutions into the development of innovative library metadata discovery services for the scholarly community.

You can read the project proposal for CLOCK at http://lncn.eu/ijt4 – the introductory section is below.

The University of Lincoln and Cambridge University Library both delivered successful projects (Jerome and COMET) for the JISC Infrastructure for Resource Discovery Programme in 2011. This is a proposal for the continuation of and elaboration upon the work of both projects, via a programme of development work shared between the two institutions.

Throughout both projects (COMET-Jerome), parallel approaches in technology and data structure were noted and commented upon. A ‘mash day’ workshop event held in Cambridge in August aimed to explore these differences as well as areas of potential synergy. Here project members identified several points of interest to take forward.

Both projects produced outputs of interest to researchers, students, librarians, developers, and designers of bibliographic discovery environments. The CLOCK project will harness the success of these two complementary initiatives and investigate new approaches to data creation and discovery in the library domain. In particular, it will investigate, propose, and develop new, web-based bibliographic tools/APIs which will make it easier for developers, academic libraries and library end-users (esp. researchers) to find Open Bibliographic Data and incorporate that data into systems and workflows.

This project is an opportunity to [1] exploit through real-world applications the significant amount of data released openly by Cambridge University Library; [2] apply the Jerome database architecture, iterative development methodology, and API framework to a bibliographic dataset an order of magnitude greater than the University of Lincoln’s; and [3] build a new set of tools and demonstrator services which will enable the future development of public Open Bib Data web applications of practical utility to libraries and end-users.

The project will be supported by library consultant Owen Stephens, who will help to put the work into a national context, relating CLOCK to the wider movement toward Open Bib Data and the work of the JISC Discovery initiative. It will take place in an environment (Lincoln/Cambridge) where a culture of developer inquiry and experimentation is encouraged and nurtured. It is also endorsed by senior library management at both universities.

Both universities are involved in complementary development work which will both inform and be informed by CLOCK: at Cambridge, Ed Chamberlain is guiding the development of the JISC Open Bibliography 2 project; in Lincoln, Paul Stainthorp is lead researcher on the #jiscmrd Orbital project, which is investigating the management of research data, with some areas of overlap.

CLOCK will operate as part of the wider JISC Digital Infrastructure: Information and library infrastructure: Resource discovery, and support the recent concerted effort to move toward openly licensed library discovery in UK Higher Education and beyond.

In the background at Discovery event

Posted on May 26th, 2011 by Paul Stainthorp


A few of the Jerome project team are at the JISC/RLUK event in London: ‘Discovery – building a UK metadata ecology’. Our slides are running on a screen in the foyer; I’ll be hanging around to talk about them.

An elastic bucket down the data well (#rdtf in Manchester)

Posted on April 20th, 2011 by Paul Stainthorp

I was in Manchester on Monday for Opening Data – Opening Doors, a one-day “advocacy workshop” hosted by JISC and RLUK under their Resource Discovery Taskforce (#rdtf) programme. I delivered a five-minute ‘personal pitch’ about Jerome, open data, and the rapid-development ethos that’s developing at Lincoln.

Ken Chad is writing up a report from the day and Helen Harrop is producing a blog, both of which will be signposted from the website: http://rdtf.mimas.ac.uk/

The big data question

All the presentations can be viewed on slideshare, but there were some particular moments that I think are worth picking out:

The JISC deputy, Prof. David Baker, was first up. His presentation, ‘A Vision for Resource Discovery’, should be compulsory reading for university librarians. See, in particular, slides #6 (guiding principles of the RDTF), #8 (a future state of the art by 2012), and #11 (key themes).

[Three slides from David Baker's presentation]

Following this introduction, there were three ‘perspectives’, short presentations “reflecting on the real world motivations and efforts involved in opening up bibliographic, archival and museums data to the wider world”: from the National Maritime Museum, the National Archives

…and from Ed Chamberlain of (Jerome’s ‘sister project’) COMET (Cambridge Open METadata), the perspective from Cambridge University Library on opening up access to their non-inconsiderable bibliographic data. N.B. slides #4 (what does COMET entail?), #9 (licensing) and, more than anything else, slide #16 (“beyond bibliography”).

[Three slides from Ed Chamberlain's presentation]

The first breakout/discussion session which I sat in on looked at technical and licensing constraints on opening up access to [bib] data. This was the point at which the tortured business metaphors started to pile up. ‘Buckets’ of data. ‘Elastic’ buckets that can expand to include any kind of data. And (my personal contribution, continuing the wet theme): data often exist at the bottom of a ‘well’. Just because a well is open at the top, it doesn’t necessarily make it easy to get the water out! You need another kind of bucket – a service bucket that makes it possible to extract and make use of the water. Sorry, data. What were we talking about again?

Then a series of 5-minute ‘personal pitches’, including mine just after lunch. I didn’t use slides, but I’m typing up my handwritten notes on Google Docs and I’ll post them as a separate blog post when I get a chance.

David Kay (SERO), Paul Miller (Cloud of Data) and Owen Stephens delivered the meat of the afternoon session in their presentation, ‘The Open Bibliographic Data Guide – Preparing to eat the elephant’. The website containing the Open Bib Data Guide (which has not been formally launched until now) can be found at: http://obd.jisc.ac.uk/

The site itself is going to be invaluable in hand-holding and guiding institutions through the possibilities in opening up access to their own bibliographic data (OBD). Slides from the presentation that are particularly worth noting are #8 (which shows the colour-coding used to distinguish the different OBD use-cases) and #14 (examples of existing OBD).

[Two slides from the OBD presentation]

Paul Walk’s presentation, ‘Technical standards & the RDTF Vision: some considerations’, is the source of the slide which I photographed (at the top of this blog post). Paul talked about ‘safe bets’; aspects of the Web that we can rely on playing a part in allowing us to create a distributed environment for resource discovery: including “ROASOADOA” (Resource- / Service- / Data-Oriented Architecture), persistent identifiers, and a RESTful approach. See also this blog post.

In the second breakout/discussion session, we discussed technical approaches. One of the themes which we kept coming back to was that of two approaches (encapsulated by Paul’s slide) which—while not mutually exclusive—may require different business cases or different explanations in order to be taken up by institutions. We characterised the two approaches as:

  • Raw open data vs Data services
  • Triple store vs RESTful APIs
  • Jerome vs COMET (bit of a caricature, this one, but not entirely unjustified!)
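For anyone who wasn't in the room, here is a rough sketch of the difference between the two approaches. It is a toy illustration under my own assumptions, not the code of either project: the URIs, the `dc:` predicate labels and the function names `match` and `get_record` are all made up for the example. The first function is the 'raw data' view, where the consumer has to know the shape of the triples and filter them; the second is the 'service' view, where the provider hands back one assembled record per identifier.

```python
# Hypothetical identifiers and predicates, invented for illustration only.
TRIPLES = [
    ("http://example.org/bib/1", "dc:title", "An Example Title"),
    ("http://example.org/bib/1", "dc:creator", "A. N. Author"),
    ("http://example.org/bib/2", "dc:title", "Another Title"),
]


def match(triples, s=None, p=None, o=None):
    """Raw-data style: filter triples by pattern (None acts as a wildcard)."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def get_record(triples, uri):
    """Service style: assemble a single record for one identifier.

    Simplification: a repeated predicate keeps only its last value.
    Returns None when the identifier is unknown.
    """
    fields = {p: o for s, p, o in triples if s == uri}
    if not fields:
        return None
    return {"id": uri, **fields}
```

Calling `match(TRIPLES, p="dc:title")` gives the consumer every title triple to work through, whereas `get_record(TRIPLES, "http://example.org/bib/1")` returns something a web developer could use straight away. That difference in effort is roughly why the two approaches may need different business cases.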

I was gratified that Lincoln’s approach to rapid development and provision of open services was also referred to in non-ungratifying terms, as a model which could be valuable for the HE sector as a whole.

Finally, we heard what’s next for the #rdtf programme. It’s going to be rebranded as ‘Discovery’ and formally re-launched under the new name at another event: ‘Discovery – building a UK metadata ecology’ on Thursday, 26 May 2011, in London. See you there?


Risk Analysis and Success Plan

Posted on March 25th, 2011 by Paul Stainthorp

The long-term risks of not developing and innovating in our library services (loss of relevance for the library’s services; student, teacher and researcher dissatisfaction; the inability to further innovate because of the ‘chilling’ effect of out-of-date technology) outweigh any risks internal to the project. Through work done so far, we are confident that we have sufficient skills and experience among team members to undertake each of the deliverables. The project has support from the most senior level of university management.

Our main concern is around the licensing of our data, some of which is supplied by third parties (see note 1). We will ensure that mechanisms are put in place during the development of our APIs to ensure that we conform to any licensing agreements and have a sufficient body of data that we own to make the project worthwhile. IPR issues are further addressed below.
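As a rough illustration of the kind of mechanism we have in mind, the sketch below withholds any record whose source is not cleared for open release. It is a minimal sketch under assumed conditions: the `source` field, the labels `local` and `vendor`, and the function name `releasable` are hypothetical, and the actual licensing agreements would decide what belongs in the allowed set.

```python
# Hypothetical source labels; the real agreements would determine this set.
OPEN_SOURCES = {"local"}


def releasable(records, open_sources=OPEN_SOURCES):
    """Return only records whose recorded source is cleared for open release.

    Records with no recorded source are withheld, on the assumption that
    unknown provenance should be treated as not cleared.
    """
    return [r for r in records if r.get("source") in open_sources]


# Made-up sample records for illustration.
records = [
    {"id": 1, "title": "Locally catalogued item", "source": "local"},
    {"id": 2, "title": "Vendor e-book record", "source": "vendor"},
    {"id": 3, "title": "Record of unknown origin"},
]
```

Here `releasable(records)` keeps only the locally catalogued item. Defaulting to withholding is a deliberate design choice in this sketch: nothing reaches the API unless its provenance has been recorded and cleared.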

As always, there is a minor risk that team members may be absent during the project due to illness, but this will be mitigated by close collaboration on work packages and sharing of responsibilities.

We have worked on Jerome, experimentally, for the last four months and have resolved many of the initial questions that might arise in a project like this. From the point of view of our ICT systems, many of the technological and related cultural changes (e.g. the use of NoSQL rather than relational databases) are being worked through and positively demonstrated in our work on Total ReCal.

We hope to achieve a number of objectives through the deliverables of this project. If we find we have been over-ambitious, we will prioritise the release of bibliographic data and the development of sustainable, supported APIs. The public-facing search portal and personalisation engine can be completed post-project, based on the achievements of our other deliverables.

In line with the RDTF report, the Jerome project has recognised that “change is vital if library catalogues are to retain relevance and visibility in the wider networked discovery environment.” Similarly, we also understand there is a business case for making our library services data available in open and standardised ways and will derive a number of indicative use-cases from those provided in the Open Bibliographic Data Guide to inform our business case. 

The proposed project receives the full support of the University Librarian, and accords with the Library’s strategic aim to develop innovative and student-centred services, responsive to new ways of thinking and doing, and which support the Student as Producer agenda.

The proposal is born out of lessons learned from the Learning Landscape project, Total ReCal, JISCPress and our investigative work on Jerome so far. We are committed to developing and improving our virtual research, teaching and learning environment and see our work on Jerome as fundamental and integral to this commitment.

We will engage in a number of ways to communicate with stakeholders throughout the project (blog, Twitter, conferences, case studies) so as to ensure that our work is widely known, understood and supported.

We will demonstrate the value of innovation around library data to all stakeholders, in terms of how they express our virtual Learning Landscape, improve the effectiveness of communication across the university and contribute to a more efficient use (and re-use) of data.

The University of Lincoln can demonstrate that past JISC-funded projects have led to sustained services that continue to benefit our staff, students and the wider community.

  1. Including MARC records sourced from national libraries via Z39.50 services (though not directly from OCLC); e-book MARC records from commercial vendors; journal TOCs via RSS.