Nicholas Raikes

Unusually for Cambridge, I was born and brought up here, and started working for Cambridge University Press & Assessment as a temp when home for the summer from Liverpool John Moores University. I graduated with a BSc in Applied Physics, and now also have specialist qualifications in computing and education from the universities of Bradford and Bristol respectively.

After graduation I briefly took up a post in question paper preparation at Cambridge University Press & Assessment, but swiftly joined the Research and Evaluation division where I worked on, and in due course led, many studies in areas such as inter-board comparability, standards over time, equity and qualification evaluation. I subsequently worked extensively on the development and introduction of on-screen marking and on operationalising data analytics. With over twenty years’ experience in technical, research, and leadership roles at Cambridge University Press & Assessment, I now lead a programme of work in innovation and development.

Outside of work I enjoy foreign travel, hiking, theatre and concerts.

Publications

2019

Data, data everywhere? Opportunities and challenges in a data-rich world

Raikes, N. (2019).  Data, data everywhere? Opportunities and challenges in a data-rich world. Research Matters: A Cambridge Assessment publication, 27, 16-19.

Data has always been central to large-scale educational assessments, such as those provided by Cambridge Assessment, but digital learning and assessment products vastly increase the amount, types, and immediacy of data available. Advances in data science and technology have opened up this data for analysis as never before, and "big data" is much hyped both by those who fear it and by those who welcome it. But, as yet, there has been no big data revolution in education. In this article, I briefly outline current uses of data analytics at Cambridge Assessment, and how big data extends them. I explore some likely future uses and benefits of big data in relation to formative assessment and individualised recommendations for learners, together with the risks of statistical naivety and data misuse, and the challenge of gaining the trust of students, parents and teachers.

2018

Data, data everywhere? Opportunities and challenges in a data-rich world
Raikes, N. (2018). Data, data everywhere? Opportunities and challenges in a data-rich world. Presented at the 44th conference of the International Association for Educational Assessment, Oxford, UK, 9-14 September 2018. Video link:

https://www.cambridgeassessment.org.uk/news/video/view/data-data-everywhere/

2012

Making the most of our assessment data: Cambridge Assessment's Information Services Platform

Raikes, N. (2012). Making the most of our assessment data: Cambridge Assessment's Information Services Platform. Research Matters: A Cambridge Assessment publication, 13, 38-40.

As new technologies penetrate every part of educational assessment, data is being collected as never before. Traditionally, there were two methods of producing statistical information within Cambridge Assessment. Routine statistical information came from reports built into bespoke examination processing systems. Non-routine analyses and reports were produced by small teams of statistical experts, typically working within research units and using statistical software packages on personal computers. With increasing demand for flexible, high-volume statistical reporting, a new solution was required: one which combined the resilience and scalability of a server-based infrastructure with the flexibility of having statistical experts in charge of creating the statistical content. The Information Services Platform (ISP) is Cambridge Assessment's solution to these requirements. It provides our statistical experts with access to operational assessment data and with tools to automate and schedule analyses and reports, and to publish the resulting content on an Intranet Portal for use by colleagues across the organisation. In this article, I discuss the thinking behind the ISP in more detail and give practical examples of its use.
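
As a rough illustration of the kind of automated, scheduled reporting the ISP enables, the Python sketch below summarises a file of item-level marks and writes an HTML report that an intranet portal could serve. It is only a sketch under assumed inputs: the column names, file paths and use of pandas are illustrative, not the ISP's actual implementation.

import pandas as pd

def build_component_report(marks_csv: str, out_html: str) -> None:
    """Summarise item-level marks and publish a simple HTML report."""
    marks = pd.read_csv(marks_csv)  # hypothetical file: one row per candidate-item mark
    summary = (
        marks.groupby("item_id")["mark"]
             .agg(candidates="count", mean_mark="mean", sd="std", max_mark="max")
             .round(2)
    )
    summary.to_html(out_html)  # written to a location the portal serves

if __name__ == "__main__":
    # In practice a scheduler would run this nightly during the marking period.
    build_component_report("item_marks.csv", "component_report.html")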

2011

Making the most of our assessment data: Cambridge Assessment’s Information Services Platform
Raikes, N. (2011).  Presented at the 37th annual conference of the International Association for Educational Assessment (IAEA), Manila, Philippines, 23-28 October 2011.
Evaluating Senior Examiners' use of Item Level Data

Shiell, H. and Raikes, N. (2011). Evaluating Senior Examiners' use of Item Level Data. Research Matters: A Cambridge Assessment publication, 12, 7-10.

Many of Cambridge Assessment's written examination scripts are now scanned and marked on screen by examiners working on computers. One benefit arising from on-screen marking is that the marks are captured at item or question-part level and are available for analysis in Cambridge within hours of being submitted by examiners. Cambridge Assessment now routinely analyses these item marks and provides subject staff and senior examiners with reports containing Item Level Data (ILD) for nearly all examinations marked on screen. In this article, we present findings from an evaluation of senior CIE and OCR examiners’ use of these Item Level Data reports.

2010

Must examiners meet in order to standardise their marking? An experiment with new and experienced examiners of GCE AS Psychology

Raikes, N., Fidler, J. and Gill, T. (2010). Must examiners meet in order to standardise their marking? An experiment with new and experienced examiners of GCE AS Psychology. Research Matters: A Cambridge Assessment publication, 10, 21-27.

When high-stakes examinations are marked by a panel of examiners, the examiners must be standardised so that candidates are not advantaged or disadvantaged according to which examiner marks their work.

It is common practice for Awarding Bodies’ standardisation processes to include a “Standardisation” or “Co-ordination” meeting, where all examiners meet to be briefed by the Principal Examiner and to discuss the application of the mark scheme in relation to specific examples of candidates’ work.  Research into the effectiveness of standardisation meetings has cast doubt on their usefulness, however, at least for experienced examiners.  

In the present study we addressed the following research questions:

1. What is the effect on marking accuracy of including a face-to-face meeting as part of an examiner standardisation process?
2. How does the effect on marking accuracy of a face-to-face meeting vary with the type of question being marked (short-answer or essay) and the level of experience of the examiners?
3. To what extent do examiners carry forward standardisation on one set of questions to a different but very similar set of questions?

2009

Must examiners meet in order to standardise their marking? An experiment with new and experienced examiners of GCE AS Psychology
Raikes, N., Fidler, J. and Gill, T. (2009). Presented at the British Educational Research Association (BERA) Annual Conference, Manchester, UK, 5 September 2009.
Grading examinations using expert judgements from a diverse pool of judges

Raikes, N., Scorey, S. and Shiell, H. (2009). Grading examinations using expert judgements from a diverse pool of judges. Research Matters: A Cambridge Assessment publication, 7, 4-8.

In normal procedures for grading GCE Advanced level and GCSE examinations, an Awarding Committee of senior examiners recommends grade boundary marks based on their judgement of the quality of scripts, informed by technical and statistical evidence.  The aim of our research was to investigate whether an adapted Thurstone Pairs methodology (see Bramley and Black, 2008; Bramley, Gill and Black, 2008) could enable a more diverse range of judges to take part.  The key advantage of the Thurstone method for our purposes is that it enables two examinations to be equated via judges making direct comparisons of scripts from both examinations, and does not depend on the judges’ internal conceptions of the standard required for any grade.

A General Certificate of Education (GCE) Advanced Subsidiary (AS) unit in biology provided the context for the study reported here. The June 2007 and January 2008 examinations from this unit were equated using paired comparison data from the following four groups of judges: members of the existing Awarding Committee; other examiners who had marked the scripts operationally; teachers who had taught candidates for the examinations but not marked them; and university lecturers who teach biology to first-year undergraduates.

We found very high levels of intra-group and inter-group reliability for the scales and measures estimated from all four groups’ judgements.  When boundary marks for January 2008 were estimated from the equated June 2007 boundaries, there was considerable agreement between the estimates made from each group’s data.  Indeed for four of the boundaries (grades B, C, D and E), the estimates from the Awarders’, examiners’ and lecturers’ data were no more than one mark apart, and none of the estimates were more than three marks apart. 

We concluded that the examiners, teachers, lecturers and members of the current Awarding Committee made very similar judgements, and that members of all four groups could take part in a paired comparison exercise for setting grade boundaries without compromising reliability.
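
For readers unfamiliar with the approach, the Python sketch below computes bare-bones Thurstone Case V scale values from a matrix of paired-comparison "wins". It is purely illustrative: the data are invented and this is not the adapted methodology used in the study.

import numpy as np
from scipy.stats import norm

def thurstone_case_v(wins: np.ndarray) -> np.ndarray:
    """Estimate scale values for objects (e.g. scripts) from a matrix where
    wins[i, j] is the number of times object i was judged better than object j."""
    totals = wins + wins.T                            # comparisons made per pair
    with np.errstate(divide="ignore", invalid="ignore"):
        p = np.where(totals > 0, wins / totals, 0.5)  # proportion preferring i over j
    p = np.clip(p, 0.01, 0.99)                        # avoid infinite normal deviates
    z = norm.ppf(p)                                   # unit-normal deviates
    np.fill_diagonal(z, 0.0)
    return z.mean(axis=1)                             # one scale value per object

# Invented example: three scripts, each pair compared ten times by judges.
wins = np.array([[0, 8, 9],
                 [2, 0, 7],
                 [1, 3, 0]])
print(thurstone_case_v(wins))  # higher value = judged better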

2008

Grading examinations using expert judgements from a diverse pool of judges
Raikes, N., Scorey, S. and Shiell, H. (2008). Presented at the 34th annual conference of the International Association for Educational Assessment (IAEA), Cambridge, UK, 7-12 September 2008.

2007

Item-level examiner agreement

Raikes, N. and Massey, A. (2007). Item-level examiner agreement. Research Matters: A Cambridge Assessment publication, 4, 34-37.

Studies of inter-examiner reliability in GCSE and A-level examinations have been reported in the literature, but typically these focused on paper totals, rather than item marks. See, for example, Newton (1996). Advances in technology, however, mean that increasingly candidates’ scripts are being split by item for marking, and the item-level marks are routinely collected. In these circumstances there is increased interest in investigating the extent to which different examiners agree at item level, and the extent to which this varies according to the nature of the item. Here we report and comment on intraclass correlations between examiners marking sample items taken from GCE A-level and IGCSE examinations in a range of subjects. The article is based on a paper presented at the 2006 Annual Conference of the British Educational Research Association.
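
As an illustration of the kind of agreement statistic involved, the Python sketch below computes a one-way random-effects intraclass correlation, ICC(1), for a set of answers each marked independently by several examiners. The formulation and the data are assumptions for illustration, not the exact analysis reported in the article.

import numpy as np

def icc1(marks: np.ndarray) -> float:
    """One-way random-effects ICC for marks of shape (n_answers, k_examiners)."""
    n, k = marks.shape
    grand_mean = marks.mean()
    row_means = marks.mean(axis=1)
    ms_between = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)       # between-answer mean square
    ms_within = ((marks - row_means[:, None]) ** 2).sum() / (n * (k - 1))  # within-answer mean square
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Invented data: five answers, each marked by three examiners.
marks = np.array([[2, 2, 3],
                  [0, 1, 0],
                  [4, 4, 4],
                  [1, 2, 1],
                  [3, 3, 2]], dtype=float)
print(round(icc1(marks), 3))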

Quality control of examination marking

Bell, J. F., Bramley, T., Claessen, M. J. A. and Raikes, N. (2007). Quality control of examination marking. Research Matters: A Cambridge Assessment publication, 4, 18-21.

As markers trade their pens for computers, new opportunities for monitoring and controlling marking quality are created. Item-level marks may be collected and analysed throughout marking. The results can be used to alert marking supervisors to possible quality issues earlier than is currently possible, enabling investigations and interventions to be made in a more timely and efficient way. Such a quality control system requires a mathematical model that is robust enough to provide useful information with initially relatively sparse data, yet simple enough to be easily understood, easily implemented in software and computationally efficient – this last is important given the very large numbers of candidates assessed by Cambridge Assessment and the need for rapid analysis during marking. In the present article we describe the models we have considered and give the results of an investigation into their utility using simulated data.
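
For illustration only, the Python sketch below shows a deliberately simple check of the kind such a system might run while marking is in progress: comparing each examiner's marks on pre-marked control responses with the definitive marks and flagging examiners whose average discrepancy is large. The thresholds and data structures are assumptions; it is not one of the models investigated in the article.

from collections import defaultdict

def flag_examiners(control_marks, definitive, threshold=1.0, min_items=5):
    """control_marks: iterable of (examiner_id, response_id, mark) tuples.
    definitive: dict mapping response_id to the agreed definitive mark.
    Returns examiners whose mean absolute discrepancy exceeds the threshold."""
    diffs = defaultdict(list)
    for examiner, response, mark in control_marks:
        diffs[examiner].append(abs(mark - definitive[response]))
    return {
        examiner: sum(d) / len(d)
        for examiner, d in diffs.items()
        if len(d) >= min_items and sum(d) / len(d) > threshold
    }

# Invented example: examiner "E2" is over a mark too severe on average.
control = (
    [("E1", r, m) for r, m in [("r1", 3), ("r2", 1), ("r3", 4), ("r4", 2), ("r5", 0)]]
    + [("E2", r, m) for r, m in [("r1", 2), ("r2", 0), ("r3", 1), ("r4", 1), ("r5", 0)]]
)
definitive = {"r1": 3, "r2": 1, "r3": 4, "r4": 2, "r5": 0}
print(flag_examiners(control, definitive))  # {'E2': 1.2}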

2006

Item level examiner agreement
Massey, A.J. and Raikes, N. (2006). Presented at the British Educational Research Association (BERA) annual conference, Warwick, UK, 9 September 2006.
The Cambridge Assessment/Oxford University automatic marking system: Does it work?

Raikes, N. (2006). The Cambridge Assessment/Oxford University automatic marking system: Does it work? Research Matters: A Cambridge Assessment publication, 2, 17-21.

In the first issue of Research Matters, Sukkarieh et al. (2005) introduced our work investigating the automatic marking of short, free text answers to examination questions. In this article, I give details and results of an evaluation of the final prototype automatic marking system that was developed.

Quality control of marking: Some models and simulations
Bell, J.F., Bramley, T., Claessen, M.J.A. and Raikes, N. (2006). Presented at the 32nd annual conference of the International Association for Educational Assessment (IAEA), Singapore, 21-26 May 2006.

2005

Automatic marking of short, free text responses

Sukkarieh, J. Z., Pulman, S. G., and Raikes, N. (2005). Automatic marking of short, free text responses. Research Matters: A Cambridge Assessment publication, 1, 19-22.

Many of UCLES' academic examinations make extensive use of questions that require candidates to write one or two sentences. With increasing penetration of computers into schools and homes, a system that could partially or wholly automate valid marking of short, free text answers typed into a computer would be valuable, but would seem to presuppose a currently unattainable level of performance in automated natural language understanding. However, recent developments in the use of so-called ‘shallow processing’ techniques in computational linguistics have opened up the possibility of being able to automate the marking of free text without having to create systems that fully understand the answers. With this in mind, UCLES funded a three-year study at Oxford University. Work began in summer 2002, and in this paper we introduce the project and the information extraction techniques used. A further paper in a forthcoming issue of Research Matters will contain the results of our evaluation of the automatic marks produced by the final system.
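
As a much-simplified illustration of what pattern-based marking of a short answer can look like, the Python sketch below awards one mark for each required concept whose patterns match the candidate's response. The question, patterns and mark scheme are invented, and the actual system used far richer information extraction techniques than plain regular expressions.

import re

# Invented mark scheme: one mark per concept, each defined by alternative patterns.
MARK_SCHEME = {
    "evaporation": [r"\bevaporat\w*", r"\bturns? (in)?to (a )?gas\b"],
    "heat_source": [r"\bheat\w*", r"\bsun\b", r"\bwarm\w*"],
}

def mark_answer(answer: str) -> int:
    """Return one mark per concept for which at least one pattern matches."""
    answer = answer.lower()
    return sum(
        any(re.search(pattern, answer) for pattern in patterns)
        for patterns in MARK_SCHEME.values()
    )

print(mark_answer("The water is warmed by the sun and evaporates."))  # 2 marks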

2004

Auto-marking 2: An update on the UCLES-Oxford University research into using computational linguistics to score short, free text responses.
Sukkarieh, J.Z., Pulman, S.G. and Raikes, N. (2004). Presented at the 30th annual conference of the International Association for Educational Assessment (IAEA), Philadelphia, USA, 13-18 June 2004.
From Paper to Screen: some issues on the way
Raikes, N., Greatorex, J. and Shaw, S. (2004). Presented at the 30th annual conference of the International Association for Educational Assessment (IAEA), Philadelphia, USA, 13-18 June 2004.

2003

The horseless carriage stage: replacing conventional measures
Raikes, N. and Harding, R. (2003). The horseless carriage stage: replacing conventional measures. Assessment in Education: Principles, Policy & Practice, 10(3), 267-277.
Auto-Marking: Using Computational Linguistics to Score Short, Free Text Responses
Sukkarieh, J., Pulman, S. and Raikes, N. (2003). Presented at the 29th annual conference of the International Association for Educational Assessment (IAEA), Manchester, UK, October 2003.

2002

On Screen Marking of Scanned Paper Scripts
Raikes, N. (2002). On Screen Marking of Scanned Paper Scripts. University of Cambridge Local Examinations Syndicate.

1998

Investigating A-level mathematics standards over time
Bell, J.F., Bramley, T. and Raikes, N. (1998). Investigating A-level mathematics standards over time. British Journal of Curriculum and Assessment, 8(2), 7-11.

1997

Standards in A level Mathematics 1986-1996
Bell, J.F., Bramley, T. and Raikes, N. (1997). Presented at the British Educational Research Association (BERA) annual conference, York, UK, 11-14 September 1997.

Research Matters

Research Matters is our free biannual publication which allows us to share our assessment research, in a range of fields, with the wider assessment community.