Ways of weighing up assessment load against outcome validity

ARD Research Division webinar
Date: 27 Mar 2025
Time: 10:00 - 11:00
Venue: Online
Type: Webinar
Fee: Free

Decisions on how thoroughly to assess a candidate have always involved balancing the need for valid and reliable assessment outcomes against the potential burden on candidates. With growing concerns about student stress in the UK post-Covid, reducing assessment volume without compromising outcome validity has become an increasingly important area of focus. This webinar presents two recent projects from the Research Division that explore this balance.

Tom Benton - How long should a high stakes test be?

Tom Benton's research asks one of the most obvious questions in assessment design: if a test has a high-stakes purpose, how long should it be? First, he explores this question from a psychometric point of view, starting from the range of minimum test reliability levels suggested in the academic literature. Then, using published data on the typical relationship between the length, duration and reliability of exams, he develops a range of recommendations about the likely required duration of assessment. Second, to prompt deeper reflection on the results of the psychometric approach, he compares the actual lengths of exams in England with those in other education systems around the world. These comparisons reveal very wide variations in the amount of time young people are required to spend taking exams in different countries and at various ages. He concludes with some reflections on how the length of exams relates to the purpose of the assessment or to how its results will be used.
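As a rough illustration of how test length and reliability are linked, the sketch below uses the Spearman-Brown prediction formula, a standard psychometric result. It is not necessarily the method used in the research described above, and the 90-minute duration and the reliability values are hypothetical numbers chosen only for the example.

```python
def predicted_reliability(reliability, length_factor):
    """Spearman-Brown: reliability of a test lengthened by `length_factor`."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def required_length_factor(reliability, target):
    """How many times longer a test must be to reach `target` reliability."""
    return (target * (1 - reliability)) / (reliability * (1 - target))


# Hypothetical example: a 90-minute exam with reliability 0.80,
# and a target reliability of 0.90.
current_minutes = 90
factor = required_length_factor(0.80, 0.90)
print(f"Length factor: {factor:.2f}")
print(f"Implied duration: {current_minutes * factor:.0f} minutes")
```

The formula assumes that any added material behaves like the existing test content, so the result is best read as an approximate guide rather than a precise requirement.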

Tim Gill - The impact of reducing the number of exams on results in GCSEs

GCSE students are required to take a large number of exams at the end of their courses. For example, GCSE Maths is made up of three exam components, each 1.5 hours long, an amount of assessment that many stakeholders believe is unnecessary. One simple way of reducing this would be to remove one of the assessments per subject (for example, by combining the content of the three components into two components/exams). However, one concern is that this might reduce the reliability and validity of the final grade, because it would be based on fewer exams. Tim Gill's work finds that reducing the number of GCSE exam components generally maintains grading reliability, particularly in Maths, where around 85% of candidates would have received the same grade if the number of components were reduced from three to two. This may reflect the different structure of Maths, where all topics are examined in all three components. However, variability was more pronounced in subjects with non-exam assessments (NEAs).
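The idea of comparing grades based on three components with grades based on two can be sketched with a toy simulation. This is not the method or the data used in the research described above. The candidate count, the noise level, the nine equal-sized grade bands and the function names are all assumptions made for illustration, so the output should not be compared with the 85% figure.

```python
import random


def grade(scores, n_grades=9):
    """Assign grades 1..n_grades by rank, using equal-sized grade bands."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    grades = [0] * len(scores)
    for rank, i in enumerate(order):
        grades[i] = 1 + rank * n_grades // len(scores)
    return grades


def same_grade_rate(n_candidates=5000, noise_sd=0.5, seed=0):
    """Share of simulated candidates whose grade is unchanged when the
    total is based on two components instead of three."""
    rng = random.Random(seed)
    three, two = [], []
    for _ in range(n_candidates):
        ability = rng.gauss(0, 1)  # hypothetical underlying ability
        # Each component score is ability plus independent measurement noise.
        comps = [ability + rng.gauss(0, noise_sd) for _ in range(3)]
        three.append(sum(comps))
        two.append(sum(comps[:2]))
    g3, g2 = grade(three), grade(two)
    return sum(a == b for a, b in zip(g3, g2)) / n_candidates


print(f"Same grade with two components: {same_grade_rate():.1%}")
```

In this sketch every component measures the same underlying ability, which is loosely analogous to a subject where all topics appear in every component. Raising `noise_sd` lowers the agreement rate.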

There will be opportunities for Q&A at the end of each presentation.
