Deep Log Analysis Workplan
CIBER will undertake a set of work packages for the DLA study. The work packages are listed below. Each one indicates the objectives being addressed, the rationale for the work package, the methodology, the timing, and who data will be collected from.
Package 1. Sampling the logs
Timing: November – December 2007
Data Collection from: MyiLibrary and Wolters Kluwer
Rationale and methodology: To ensure the smooth progress of log monitoring and evaluation, sample logs will be obtained from the aggregators so that they can be tested, assessed and understood.
Package 2. Benchmarking data collection
Timing: January 2008
Data Collection from: All subscribing institutions
2.1) For staff and students
Rationale: A survey of staff and students at all universities subscribed to the national e-books observatory project will be carried out to provide contextual data for the DLA evaluation. The objectives include:
- defining an initial benchmark of the academic population in terms of their awareness and existing levels of use of e-books and the purposes for which these resources are used;
- profiling general attitudes towards formal library provision, in print or electronic formats, and overall levels of satisfaction with library services;
- obtaining a better understanding of the most effective library marketing and communication channels: what are the best ways to reach out to students and faculty regarding new e-service developments?
The questionnaire will also provide publicity, raising awareness of the JISC e-book initiative.
Methodology: Questions for the online survey will be informed by the SuperBook questionnaire and log findings. An email invitation to participate in the survey will be distributed locally within each participating institution using comprehensive all-staff and all-student (or similar) mailing lists. The body of the email will contain a link to a central survey database, held and managed at UCL, probably using SurveyMonkey software. This will allow the rapid production of simple headline reports for each institution. Response rates and possible biases will be monitored systematically so that the findings can be reported within given confidence levels and intervals. Deeper analysis of the survey responses will be undertaken using the Statistical Package for the Social Sciences (SPSS). Central data collection means that bespoke findings can be generated by institution, by broad subject, and for the whole population. Sophisticated analytical techniques will be used to segment users (and non-users) by various demographic and attitudinal characteristics.
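As an illustration only of what reporting "within given confidence levels and intervals" can involve, the Python sketch below computes a response rate and a Wilson score interval for a survey proportion. It is not CIBER's stated procedure: the function names, the choice of the Wilson interval, and the figures in the example are all assumptions made for the sketch.

```python
import math


def response_rate(responses, invited):
    """Share of invited people who completed the survey."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return responses / invited


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumed level).
    Returns (lower, upper) bounds for the true proportion.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical figures: 400 respondents out of 5,000 invited,
# of whom 120 report having used an e-book.
rate = response_rate(400, 5000)
low, high = wilson_interval(120, 400)
print(f"response rate {rate:.1%}, e-book use between {low:.1%} and {high:.1%}")
```

An interval of this kind describes sampling error only; it does not correct for non-response bias, which would have to be monitored separately.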
2.2) For subscribed HE libraries
Rationale: An online questionnaire will be used to provide contextual information, especially on e-books made available before the Observatory and e-books that will be offered in addition to those supplied by the Observatory. It will also collect baseline information on the major technical routes provided to e-book content (e.g. is federated search implemented for e-books and, if so, how are they indexed?) and, in broad terms, what kinds of promotional engagement the library uses to reach out to its patrons.
Methodology: This survey will be much more open-ended than the one aimed at the much larger population of faculty and students, and will take the form of an MS Word forms document that can be filled out offline. The document will contain drop-down menus, some closed questions, and substantial fields for free text. The survey will be directed initially to the Head of Library and Information Services, whose co-operation will be sought in co-ordinating the completion of any gaps. This will need the involvement of professional library staff, perhaps in their various collection development, technical and subject specialist roles. A synthesis of the survey findings will be presented in summary and will also serve as an essential information resource to assist in the interpretation of the subsequent DLA and qualitative phases.
Package 3. Raw server log data collection and analysis
Timing: Data collection from January to December 2008
Data Collection from: All subscribing institutions
Rationale: This package lies at the heart of the whole project in that it provides the robust demand and use evidence base for the project. It will also provide invaluable data to assist in the promotion, creation/design and pricing of e-books, and will provide the questions to drive the questionnaires, interviews and focus groups.
Methodology: Logs will be obtained from the aggregators for all universities for the whole period, and analysed at four-monthly intervals (January-April, May-August and September-December) in order to monitor change and impacts closely and to inform the survey and qualitative work. These data will be processed and loaded into SPSS to obtain rich, detailed and robust pictures of the information seeking behaviour of e-book users.
3.1) Processing of the data will be undertaken and key usage/activity metrics will be provided for all institutions.
3.2) A representative sample of institutions will be subject to more detailed evaluation.
3.3) Data Collection from: All subscribing institutions
Methodology: The monitoring of usage/take-up/activity will employ a wide range of measurements in order to provide a comprehensive picture (no individual measurement can tell the full story). The metrics will include:
a. Number of pages (chapters) viewed
b. Number of sessions conducted
c. Site penetration (number of views per session)
d. Amount of time spent viewing a page
e. Duration of a session
f. Number of individual searches conducted in a session
g. Number of repeat visits made
h. Whether publisher’s print facility was used (a possible impact/outcome metric)
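Several of the metrics listed above (pages viewed, sessions, site penetration, page viewing time, session duration and repeat visits) can be derived from time-stamped page requests. The Python sketch below shows one possible derivation and is offered as an illustration, not as CIBER's processing method. The record layout (user key, timestamp, page identifier), the 30-minute inactivity cut-off used to split sessions, and all function names are assumptions. Searches per session and use of the print facility would need event types that this simplified record does not carry.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Assumed rule: a gap of more than 30 minutes starts a new session.
SESSION_GAP = timedelta(minutes=30)


def build_sessions(records):
    """Group (user_key, timestamp, page_id) records into sessions.

    Returns a list of (user_key, hits), where hits is a time-ordered
    list of (timestamp, page_id) pairs belonging to one session.
    """
    by_user = defaultdict(list)
    for user, timestamp, page in records:
        by_user[user].append((timestamp, page))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()
        current = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - current[-1][0] > SESSION_GAP:
                sessions.append((user, current))
                current = [hit]
            else:
                current.append(hit)
        sessions.append((user, current))
    return sessions


def mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def usage_metrics(records):
    """Compute headline usage metrics from raw page-view records."""
    sessions = build_sessions(records)
    page_views = sum(len(hits) for _, hits in sessions)
    # Session duration: first to last request in the session.
    durations = [(hits[-1][0] - hits[0][0]).total_seconds() for _, hits in sessions]
    # Page time: gap to the next request; unknown for the last page of a session.
    page_times = [
        (later[0] - earlier[0]).total_seconds()
        for _, hits in sessions
        for earlier, later in zip(hits, hits[1:])
    ]
    sessions_per_user = Counter(user for user, _ in sessions)
    return {
        "page_views": page_views,
        "sessions": len(sessions),
        "views_per_session": page_views / len(sessions) if sessions else 0.0,
        "mean_session_seconds": mean(durations),
        "mean_page_seconds": mean(page_times),
        "repeat_visitors": sum(1 for n in sessions_per_user.values() if n > 1),
    }
```

Because the time spent on the final page of a session cannot be observed from requests alone, both duration figures in this sketch are lower bounds.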
CIBER has established from the SuperBook study that the e-book logs provide valuable information on the following information seeking characteristics:
1. Reach – e-book titles used and not used
2. Amount of usage per title and scatter of use
3. Number of e-books used
4. Subject of book used
5. Price of the book used (only available for some publishers)
6. Length of book in terms of pages (only available for some publishers)
7. Whether it was a catalogued book that was used
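Reach and scatter of use, the first two characteristics in the list above, can be summarised from a catalogue of available titles and a list of title views. The Python sketch below is a hypothetical illustration: the function name, the input shapes and the "share of use taken by the top N titles" measure of scatter are assumptions, not definitions taken from the SuperBook study.

```python
from collections import Counter


def reach_and_scatter(catalogue, viewed_titles, top_n=10):
    """Summarise which titles were used and how concentrated the use was.

    catalogue: collection of all title identifiers on offer.
    viewed_titles: one identifier per recorded view (repeats allowed).
    """
    catalogue = set(catalogue)
    # Ignore views of anything outside the catalogue being studied.
    counts = Counter(t for t in viewed_titles if t in catalogue)
    used = set(counts)
    total_views = sum(counts.values())
    top_views = sum(n for _, n in counts.most_common(top_n))
    return {
        "titles_used": len(used),
        "titles_not_used": sorted(catalogue - used),
        "reach": len(used) / len(catalogue) if catalogue else 0.0,
        "views_per_title": dict(counts),
        "top_n_share": top_views / total_views if total_views else 0.0,
    }
```

A high top-N share combined with low reach would indicate that use is concentrated in a few titles rather than scattered across the collection.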
DLA is not just about mining the data deeply; it is also very much about transforming log data into user data. In this regard the following analyses will be undertaken for all the subscribed institutions:
1) University/type of university of user (large/small; research/teaching; old/new)
2) The referrer link used (e.g. Google, Metalib) – this says something about the kind of user and the visibility of the resource
3) Subject/discipline, according to the subject of e-book viewed.
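The referrer analysis in item 2 amounts to grouping the referring URL recorded in each log entry into broad categories. The Python sketch below shows one way this could be done, as an illustration only. The category names and the host-name fragments used to recognise Google and Metalib are assumptions; real logs would need a rule table built from the referrers actually observed.

```python
from urllib.parse import urlparse

# Assumed rules: category label -> fragments looked for in the referring host name.
REFERRER_RULES = {
    "search engine": ("google.",),
    "library portal": ("metalib",),
}


def classify_referrer(referrer):
    """Return a broad category for a referrer URL taken from a log entry."""
    # Many log formats record a missing referrer as "-" or an empty field.
    if not referrer or referrer == "-":
        return "direct or unknown"
    host = (urlparse(referrer).hostname or "").lower()
    for label, fragments in REFERRER_RULES.items():
        if any(fragment in host for fragment in fragments):
            return label
    return "other"
```

For example, `classify_referrer("http://www.google.co.uk/search?q=e-books")` falls into the "search engine" category under these assumed rules.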
3.4) Data Collection from: A representative sample of universities, selected according to each university’s practice in labelling its sub-networks. If a university allocates meaningful and accurate labels to its sub-networks, CIBER can readily identify academic status (staff/student) and subject characteristics from those labels.
Methodology: In addition, for a sample of universities, CIBER shall undertake these user analyses:
1) Academic status defined by sub-network used (thus in the case of SuperBook CIBER could identify Halls of Residence, which provide evidence of student use)
2) Department to which the user belongs, as defined by sub-network used to access the books
3) Geographical location of user defined by IP address (e.g. on campus, off campus)
Sub-network analyses can prove problematic, so the results will be cross-checked against questionnaire data and, where feasible, against log data linked to questionnaire data.
By combining the activity (usage) measurements with the information seeking characteristics and the user demographics, CIBER can establish the impact of the e-book roll-out on a variety of scholarly communities, no matter how defined. Importantly, it also means that CIBER can identify diversity in take-up (groups of active/sophisticated users and, at the opposite end of the scale, low/non-users), establish the reasons why, and identify best practices.
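To illustrate the sub-network approach, the Python sketch below looks up an IP address in a table of labelled sub-networks and reports academic status, label and on/off-campus location. It is a minimal sketch under stated assumptions: the address ranges are reserved documentation ranges rather than real university networks, and the labels, table layout and function name are hypothetical.

```python
import ipaddress

# Hypothetical campus address space (reserved documentation ranges).
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical sub-network labels of the kind a university might maintain.
SUBNET_LABELS = [
    (ipaddress.ip_network("192.0.2.0/25"),
     {"status": "student", "label": "Halls of Residence"}),
    (ipaddress.ip_network("192.0.2.128/25"),
     {"status": "staff", "label": "Engineering department"}),
]


def describe_user(ip_text):
    """Infer location, status and label for the IP address in a log entry."""
    ip = ipaddress.ip_address(ip_text)
    on_campus = any(ip in network for network in CAMPUS_NETWORKS)
    result = {
        "location": "on campus" if on_campus else "off campus",
        "status": "unknown",
        "label": None,
    }
    for network, info in SUBNET_LABELS:
        if ip in network:
            result.update(info)
            break
    return result
```

Addresses that fall on campus but outside any labelled sub-network are reported with an unknown status, which is one reason such results need checking against other sources.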
Package 4. Observatory experiments
Timing: May to November 2008
Data Collection from: All subscribing institutions
Rationale: DLA provides opportunities for real-time experimentation and evaluation. During the lifetime of the study universities will inevitably introduce new access and promotional initiatives in order to stimulate use and extend reach. Demand and use reports will inevitably bring good practices to participants’ attention and this will encourage innovation.
Methodology: Libraries and the aggregators will be actively encouraged to experiment with new practices and to inform CIBER of any such experiments, planned promotional events, etc. Initiatives can then be closely monitored through the logs, and feedback on their success or otherwise can be provided.
Package 5. Student focus groups
Timing: Interviewing during the Spring and Autumn terms 2008
Data Collection from: 8 case study institutions
Rationale: This work package is designed to collect qualitative data from undergraduate and postgraduate taught students on the following issues:
- Use data. To complement and enrich the data derived from the DLA, this study explores students’ use of e-books, investigating, amongst other aspects: attitudes to the format/recognition of a discrete format; nature of use; positive and negative responses; and barriers to locating and using e-books. The preliminary SuperBook study will have identified a wider array of pertinent issues and research questions.
- Creation and design of e-books. Students’ use of e-books is conditioned not only by the intellectual content, but also by their responses to the design. The package will investigate students’ responses to both the aesthetic and visual elements of the e-books as well as their structure, information architecture, and navigation.
- Impact on teaching and learning. Due to the short period of time between the integration of the e-book collections into the Universities and the instigation of the research project, it seems unlikely that this material will have been thoroughly embedded within the curricula, and teaching and learning pedagogies. Consequently, any robust investigation into the impact of e-books on teaching and learning is not possible at this juncture, although at a future time it is envisaged that experimental research using laboratory conditions could be facilitated (Work Package 8). However, base-line data about initial responses of students to information presented through e-books can be collected through the focus groups.
- Evaluation of promotion. Libraries engage in a range of different methods and strategies to promote e-books. Work Package 7 is designed to analyse the libraries’ approaches to promotion in order to inform best practice. This work package seeks to collect complementary data by investigating students’ responses to modes of access (OPAC, web pages, etc.) and responses to promotional programmes. Again, the qualitative study will supplement the deep log analysis of access routes.
- Informing pricing and licensing. There are several dimensions to the issue of pricing and licensing, most of which will be explored with library staff through Work Package 7 (below). However, it is critical to survey students in order to establish their attitudes towards the purchase of print books and e-books, and to the licensing of e-books to them directly by publishers.
Methodology: Eight universities will be identified to represent each of the four disciplines: Business and Management; Engineering; Medicine; Media Studies. A total survey population in the order of 200 students selected from the institutions will be surveyed, comprising an appropriate selection of undergraduates and postgraduates and ensuring a mixture of academic years and gender. This will equate to 50 students for each discipline. A precise delineation of variables will be determined following the results of the DLA. A proven administrative methodology for identifying, contacting, and ensuring ethical access to students will be used. This methodology, which uses subject librarians and academic staff from the respective department, was employed for the duration of the JISC JUSTEIS project (Armstrong et al, 2001), and also addresses the demands of administering the focus group interview schedules and conducting the focus groups. There will be 3 groups for each university, each of about 8 students. A structured interview schedule will be distributed to participants in advance of the focus group. This will have been designed and tested as a part of the SuperBook study. The focus groups will be recorded and transcribed for subsequent analysis using NVivo 7; in addition, notes will be taken by the facilitator to ensure correct identification of contributions by participants.
Package 6. Academic staff focus groups
Timing: Interview timing is dependent upon Work Package 5, since both will be undertaken during a single visit to the institution
Data Collection from: 8 case study institutions
Rationale:
- Use data and impact on teaching and learning. Given the decision to focus on textbook and reading material, the study will not specifically explore the use of e-books by academic staff for research purposes. However, to complement the research on students’ use, another strand of the qualitative survey will be undertaken to explore staff responses to making e-books available to students, and their expectations for e-books. To address the teaching and learning issues, we will also explore the ways in which e-books are integrated into the curriculum and teaching process.
- Creation and design of e-books. To complement the students’ responses to the design issues surrounding e-books, a study of academic staff as authors and teachers will be undertaken. This will explore their responses to the aesthetic and visual elements of e-books, and to their structure, information architecture, and navigation.
- Evaluation of promotion. Work Package 7 (below) is designed to analyse the libraries’ approaches to promotion in order to inform best practice. Work Package 6 seeks to collect complementary data by investigating the responses of academic staff to modes of access (OPACs etc.) and responses to promotional programmes.
Methodology: A sample population (target = 5) of academic staff from each of the selected departments within the 8 universities will be interviewed using focus groups. A structured schedule, which will be given to participants in advance, will be used and will have been devised and piloted as a part of the SuperBook study. Interviews will be recorded and, with notes taken by the facilitator, will be transcribed and analysed using NVivo 7. This methodology was successfully used in earlier JISC e-Book Studies (Armstrong, Edwards and Lonsdale, 2002).
Package 7. Librarian interviews
Timing: During the Spring and Summer terms 2008
Data Collection from: 8 case study institutions
Rationale: Two aspects of library work have particular import for the use that students can and will make of e-books: licensing and promotion. To gain a further understanding of student use, it is important to investigate the attitudes and work of library staff responsible for establishing, managing and promoting e-book collections.
- Informing pricing and licensing. Licensing and pricing models present a range of complex challenges for the librarian. While JISC Collections is part of this project there are no licensing problems, but it is important to explore librarian responses to existing licensing/pricing models in the light of students’ own use and buying behaviour.
- Evaluation of promotion. In Package 5, the significance of library promotional strategies was mentioned, and this package is designed to elicit data about the nature of these strategies, the librarians’ perceptions of their own promotional programmes and the effectiveness of JISC’s and publishers’ promotional activities.
This will offer fundamental data against which to evaluate students’ perceptions of the effectiveness of promotional strategies and to inform best practice.
Methodology: To collect qualitative data, telephone interviews with library staff will be undertaken, using a structured schedule, which will be given to participants in advance. As in the previous work packages, these will be recorded, transcribed and analysed using NVivo 7.
Package 8. Establishing methods for future study of the impact of e-books
Timing: October to December 2008
Data Collection from: CIBER
Rationale: In Package 5, which collects base-line data on the impact of e-books on students’ learning, the difficulty of undertaking any comprehensive evaluation was highlighted, and the need to conduct future research in this neglected sphere of e-resource usage was identified. In addition to the data obtained from the students’ responses to issues surrounding the use, creation and design of e-books, there are dimensions to this field which have import for students’ learning and which can only be explored through more demanding methodological approaches, such as observation or experimental research. There is a strong case to be made for future research into the areas surrounding learning and teaching and the impact of design on e-book use and efficacy. The JISC project offers an appropriate context in which to explore the methodological implications of future studies, and to complete the qualitative study a small-scale package has been designed to enable researchers to commence work on devising appropriate methodologies.
Methodology: This will involve an analysis of the literature on methodologies employed by researchers into e-resource and e-book usage in education (e.g. Hernon et al, 2006). An exploration of the data coming from the DLA will suggest areas for qualitative research.
Package 9. Obtaining data to inform pricing & licensing models
Timing: Ongoing throughout the study
Data Collection from: Aggregators and Publishers
Rationale: A concern on the part of publishers and aggregators is that widespread and unrestricted access to e-books might harm valuable hard-copy sales. Others (librarians), however, believe that the availability of e-books will result in increased sales of hard-copy titles.
Methodology: In co-operation with the publishers and aggregators (and booksellers, where appropriate), CIBER will monitor hard-copy sales against deep log usage data for the same titles. The closing questionnaire (Package 11) will also seek to determine whether e-book availability deterred students from buying hard copies. Package 5 will also shed light here.
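One simple way to set hard-copy sales against usage for the same titles is a title-level correlation. The Python sketch below is illustrative only and is not a method stated in this workplan: the use of Pearson's correlation coefficient, the function names and the input dictionaries (title identifier mapped to a count) are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


def sales_vs_usage(sales_by_title, views_by_title):
    """Correlate sales and e-book views over titles present in both sources."""
    titles = sorted(set(sales_by_title) & set(views_by_title))
    views = [views_by_title[t] for t in titles]
    sales = [sales_by_title[t] for t in titles]
    return pearson(views, sales)
```

A coefficient of this kind indicates association only. It cannot show on its own whether e-book access caused sales to rise or fall, which is why the questionnaire and focus group evidence is also needed.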
Package 10. Comparing circulation data with usage data
Timing: Second half of 2008
Data Collection from: All subscribing institutions
Rationale: Librarians and publishers need to know whether the availability of e-books impacts on the use of hard-copy equivalents, and also on related hard-copy titles.
Methodology: Libraries will be requested to supply historic and current circulation data so that a comparison may be made with the usage data. Package 5 will also shed light here.
Package 11. Closing user questionnaire
Timing: January 2009
Data Collection from: All subscribing institutions
Rationale: To obtain self-report data on the success or otherwise of the e-book initiative, specifically in regard to general levels of satisfaction and to the academic outcomes and impacts that result from e-book access (e.g. whether students also bought the e-books in hard-copy form).
Methodology: An online questionnaire will be sent to all users from the universities. The methodology used here will replicate that used in Work Package 2. A minority of the questions, such as those regarding awareness and attitudes to e-books, will be common to both surveys so that progress against earlier benchmarks can be measured. The focus here will, however, shift to the higher-level objectives outlined above: determining the value that e-books have added to the learning, teaching and research experiences of students and faculty. For this reason, it may be necessary to tailor this final questionnaire differently for students (focusing on learning outcomes) and for staff and graduate students (focusing on teaching and research outcomes), while retaining a common core.