Activity Data Synthesis

Monday, 20 June 2011

Draft Guide: 'Identifying activity data in the library service'

[This is a draft Guide that will be published as a deliverable of the synthesis team's activities. Your comments are very much welcomed and will inform the final published version of this Guide. We are particularly interested in any additional examples you might have for the 'Additional Resources' section.]

The problem:
Libraries use a range of software systems through which users interact with premises, services and resources. The LMS is far from the only source of activity data: the OPAC and the LMS circulation module represent increasingly partial views of user attention, activity and usage in a changing world. Libraries wishing to build a picture of user interactions therefore face the challenge of identifying the appropriate data, which depends on their purpose. That purpose may range from collection management (clearing redundant material, building ‘short loan’ capacity), to providing student success performance indicators (if a correlation can be established), to developing recommender services (students who used this also used that, searched for this and retrieved that, etc.).
Let’s break the problem down. In this guide we consider the variety of sources available within library services, a list to which you may add more. In other guides we consider strategies for deriving intelligence from ‘anything that moves’ as well as from targeted data extraction and aggregation with reference to specific goals.

The options:
Libraries already working with activity data have identified a range of sources and purposes – Collection Management, Service Improvement, Student Success and Recommender Services. Potential uses of data will be limited where the user is not identified in the activity (‘No attribution’). Here are some key examples:

| Data Source | What can be counted | Value of the intelligence |
| | Visits to library | Service improvement, Student success |
| | Virtual visits to library (no attribution) | Service improvement |
| | Searches made, search terms used, full records retrieved (no attribution) | Recommender system, Student success |
| | Books borrowed, renewed | Collection management, Recommender system, Student success |
| URL Resolver | Accesses to e-journal articles | Recommender system, Collection management |
| Counter Stats | Downloads of e-journal articles | Collection management |
| Reading Lists | Occurrence of books and articles (a proxy for recommendation) | Recommender system |
| Help Desk | Queries received | Service improvement |
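
To illustrate why attribution matters for recommender services, here is a minimal sketch of the "students who used this also used that" idea applied to loan records. The record layout, the sample data and the function name also_borrowed are assumptions made for illustration; they do not describe any particular LMS export or any project's actual method. The approach only works when each loan can be tied to a user identifier.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical circulation extract: one (user_id, item_id) pair per loan.
loans = [
    ("u1", "book_a"), ("u1", "book_b"),
    ("u2", "book_a"), ("u2", "book_b"), ("u2", "book_c"),
    ("u3", "book_a"), ("u3", "book_c"),
]


def also_borrowed(loans):
    """For each item, count how many users borrowed it alongside each other item."""
    # Group items by user; a set means repeat loans by one user count once.
    items_by_user = defaultdict(set)
    for user_id, item_id in loans:
        items_by_user[user_id].add(item_id)

    # Every pair of items borrowed by the same user adds one to both directions.
    co_counts = defaultdict(Counter)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


co_counts = also_borrowed(loans)
# Items most often borrowed by the same users who borrowed book_b.
print(co_counts["book_b"].most_common())
```

With the sample data, two users borrowed both book_a and book_b and one borrowed both book_b and book_c, so book_a ranks first for book_b. Without a user identifier the grouping step is impossible, which is the limitation flagged as 'No attribution'.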

Taking it further:
Here are some important questions to ask before you start to work with user activity data:
  • Can our systems generate that data?
  • Are we collecting it? Sometimes these facilities exist but are switched off.
  • Is there enough of it to make any sense? How long have we been collecting data and how much data is collected per year?
  • Will it serve the analytical purpose we have in mind? Or could it trigger new analyses?
  • Should we combine a number of these sources to paint a fuller picture? If so, are there reliable codes held in common across the relevant systems – such as User ID?
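
Two of the questions above (is there enough data, and is there a code in common across systems) can be checked with very little code before any serious analysis starts. The following is a rough sketch only: the field names user_id and date, the sample records and all three function names are hypothetical, and real extracts will need cleaning first.

```python
from collections import Counter

# Hypothetical extracts from two systems; field names are assumptions.
visits = [
    {"user_id": "u1", "date": "2010-10-04"},
    {"user_id": "u1", "date": "2011-01-17"},
    {"user_id": "u2", "date": "2011-02-02"},
]
loans = [
    {"user_id": "u1", "date": "2010-11-12"},
    {"user_id": "u3", "date": "2011-03-08"},
]


def records_per_year(records):
    """Count records by calendar year (dates assumed to be YYYY-MM-DD strings)."""
    return Counter(r["date"][:4] for r in records)


def shared_user_ids(*sources):
    """Return the user IDs that appear in every one of the given sources."""
    id_sets = [{r["user_id"] for r in source} for source in sources]
    return set.intersection(*id_sets)


def activity_by_user(**sources):
    """Build one profile per user: a count of records from each named source."""
    profile = {}
    for name, records in sources.items():
        for r in records:
            profile.setdefault(r["user_id"], Counter())[name] += 1
    return profile


print(records_per_year(visits))
print(shared_user_ids(visits, loans))
print(activity_by_user(visits=visits, loans=loans))
```

If shared_user_ids returns only a small fraction of the users in each source, the identifiers are probably not reliable enough to combine the sources without further matching work.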

Additional resources:
Consider also the Guides on Student Success and Data Strategies.

The Library Impact Data Project (LIDP) led by the University of Huddersfield -

1 comment:

  1. To add to your data sources, you could also include proxy server logs from systems like EZProxy as a source of activity data about e-resources. The RISE project has been looking at this data source in detail. It is certainly possible to create recommendations based on the data in the proxy server log files, but to get the best out of the data it needs to be combined with student and bibliographic data.

    Libraries increasingly seem to have access to Customer Relationship Management systems, which are a potentially rich source of data about user requests, although a limitation here is how easy it may be to get access to the data.