I reviewed all 866 of Ofqual’s subject pairs visualisations (so you don’t have to)

by Tom Benton, 05 November 2024

On each results day for A levels and GCSEs, Ofqual publish a raft of accompanying data on their website, including some that can be accessed as interactive visualisations. One of these enables users to select any pair of GCSE subjects (or even a combination of three) and compare the grade distributions for candidates who took both, based upon data provided by all of the awarding organisations in England. For example, you can review the grades of 11,860 students who took GCSEs in both Drama and Biology and note that slightly more of them achieved grade 9 in Biology (13.6%) than in Drama (11.9%). From this you might jump to the conclusion that Biology is easier than Drama.

The issue with this interpretation of a single data visualisation from the Ofqual website is that it can mislead. Evidence from different pairs of subjects can be contradictory about their relative difficulty and, to truly understand what these statistics are telling us, we need to analyse them further. This is challenging, given the sheer number of pairs that can be considered using Ofqual’s presentation. There are 44 GCSE subjects in their data. Theoretically, this would allow for 946 pairs of GCSE subjects. Rising to that challenge, I spent a couple of hours of repetitive pointing and clicking and downloaded data on every one of the pairwise comparisons between GCSE subjects taken in summer 2024 and analysed them with a proper statistical method. In the end there were only 866 subject pairs to consider as Ofqual do not report data for pairs taken by fewer than 25 students.
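The pair counts quoted above can be checked with a couple of lines of Python. This is only a small arithmetic sketch; the variable names are invented for illustration.

```python
from math import comb

n_subjects = 44        # GCSE subjects in Ofqual's data
reported_pairs = 866   # pairs actually available to download

# Number of distinct unordered pairs that 44 subjects could form: 44 * 43 / 2
possible_pairs = comb(n_subjects, 2)

# Pairs missing because fewer than 25 students took both subjects
unreported_pairs = possible_pairs - reported_pairs

print(possible_pairs, unreported_pairs)
```

So 80 of the theoretically possible pairs are absent from the published data.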


The most comprehensive method for the analysis of subject pairs data is the Kelly method (see Kelly, 1975[i]). To use this method we first convert the grades achieved on GCSEs to numerical scores ranging from 0 to 9. The aim is to adjust grades so that, on average, the grades in any subject are neither higher nor lower than the grades achieved in other subjects by the same set of students. Specifically, the method calculates additive adjustments to grades so that, across all the pairs involving any particular subject, the mean adjusted grade in that subject equals the mean of the adjusted grades achieved in the paired subjects. These adjustments are known as the Kelly difficulty ratings.
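To make the balancing condition concrete, here is a minimal Python sketch of one way such adjustments could be calculated. It is not the code used for the analysis in this blog. It assumes each subject pair has already been reduced to a student count and the mean numerical grade in each of the two subjects. The function name `kelly_difficulties`, the input layout and the toy numbers are all invented for illustration.

```python
def kelly_difficulties(pairs, reference, sweeps=500, tol=1e-10):
    """Estimate additive difficulty adjustments from subject-pair summaries.

    pairs: list of (subject_a, subject_b, n_students, mean_a, mean_b), where
           mean_a and mean_b are the mean grades of students taking both.
    reference: subject whose adjustment is fixed at zero.
    """
    # For every subject, collect its pairings as
    # (other subject, n, mean in this subject, mean in the other subject).
    links = {}
    for a, b, n, mean_a, mean_b in pairs:
        links.setdefault(a, []).append((b, n, mean_a, mean_b))
        links.setdefault(b, []).append((a, n, mean_b, mean_a))

    d = {subject: 0.0 for subject in links}
    for _sweep in range(sweeps):
        biggest_change = 0.0
        for subject, rows in links.items():
            total_n = sum(n for (_other, n, _own, _oth) in rows)
            # Pick the adjustment that makes the weighted mean of
            # (own grade + own adjustment) equal the weighted mean of
            # (paired grade + paired subject's current adjustment).
            new = sum(
                n * (other_mean + d[other] - own_mean)
                for (other, n, own_mean, other_mean) in rows
            ) / total_n
            biggest_change = max(biggest_change, abs(new - d[subject]))
            d[subject] = new
        if biggest_change < tol:
            break

    # Adjustments are only defined up to a constant, so express them
    # relative to the reference subject.
    shift = d[reference]
    return {subject: value - shift for subject, value in d.items()}


# Toy data (invented): X looks half a grade harder than Maths,
# Y looks a whole grade easier.
toy_pairs = [
    ("Maths", "X", 100, 5.0, 4.5),
    ("Maths", "Y", 50, 6.0, 7.0),
    ("X", "Y", 20, 5.5, 7.0),
]
ratings = kelly_difficulties(toy_pairs, reference="Maths")
print({subject: round(value, 2) for subject, value in ratings.items()})
```

In this sketch a positive rating means students tended to score lower in that subject than in the subjects they paired it with, which matches the sign convention of Table 1. The toy data are perfectly consistent, so the ratings come out at 0.5 for X and -1.0 for Y. Real data will contain contradictions between pairs, and the method then finds the compromise that balances them.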

The full results of applying this analysis to the data from all 866 pairwise visualisations are shown in Table 1. Subjects are ordered from the easiest to hardest estimated difficulty. The final column records the highest number of students recorded in any pair (usually in pairs with Mathematics or English language). This provides some idea of the relative popularity of subjects. Kelly difficulty ratings were scaled to indicate the estimated statistical difficulty of each subject relative to Mathematics.

If we naively accept the results in this table, the Russian GCSE is almost 3 grades easier than Mathematics whereas Engineering is more than one grade harder. Whilst interesting, even a cursory review of the results reveals why we do not simply adjust grade boundaries so that all subjects have the same statistical difficulty by this definition. It is notable that the statistically “easiest” GCSEs tend to be in art subjects or minor modern languages. It is reasonable to expect that a high proportion of students entering these subjects are drawing on aptitudes and experiences gained outside of school, which might explain why they perform relatively well. Explaining why certain subjects are statistically “harder” is more challenging. However, it is notable that the three “hardest” subjects in Table 1 are all relatively rarely taken. If we return to the Drama/Biology example, taking all the data into account they appear to be of roughly equivalent difficulty.
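Because the ratings are additive and all expressed relative to Mathematics, the estimated gap between any two subjects is simply the difference between their ratings. The short Python sketch below reads a few values from Table 1; the helper name `relative_difficulty` is invented for illustration.

```python
# A handful of Kelly difficulty ratings copied from Table 1
kelly = {
    "Russian": -2.82,
    "Mathematics": 0.00,
    "Drama": 0.14,
    "Biology": 0.15,
    "Engineering": 1.21,
}


def relative_difficulty(subject_a, subject_b):
    """Estimated grades by which subject_a is harder than subject_b."""
    return round(kelly[subject_a] - kelly[subject_b], 2)


print(relative_difficulty("Biology", "Drama"))
print(relative_difficulty("Russian", "Mathematics"))
print(relative_difficulty("Engineering", "Mathematics"))
```

The Biology and Drama ratings differ by only 0.01 of a grade, which is why the two subjects appear roughly equivalent once all pairs are taken into account.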


Whilst we cannot rely on an entirely statistical approach to define how hard different subjects should be relative to one another, it remains an important topic. Statistics of the type published by Ofqual are just one element of this debate – a source of evidence that I hope this blog has summarised completely. This means there’s no need for you to spend time staring at data from the hundreds of possible pairs individually, wondering what interpretation to take (unless you really want to).

For further reading on the difficulty of aligning subjects purely statistically see Bramley (2016)[ii]. For some reassurance regarding the effect that variable subject difficulty has on school accountability measures see Benton (2016)[iii].

Table 1: Estimated Kelly difficulties of GCSE subjects based upon publicly available data published by Ofqual

| GCSE subject | Kelly difficulty | Maximum N in any subject pair |
|---|---|---|
| Russian | -2.82 | 1755 |
| Polish | -2.55 | 3845 |
| Portuguese | -2.00 | 1530 |
| Italian | -1.20 | 2605 |
| Urdu | -1.05 | 3290 |
| Art_ Photography | -0.74 | 38650 |
| Arabic | -0.68 | 2980 |
| Chinese | -0.41 | 3735 |
| Art | -0.24 | 49475 |
| Food preparation and nutrition | -0.24 | 53265 |
| Art_ Fine art | -0.22 | 61675 |
| Art_ Textiles | -0.17 | 12320 |
| Art_ Graphics | -0.14 | 8975 |
| Film studies | -0.10 | 5670 |
| Media studies | -0.01 | 26670 |
| Mathematics | 0.00 | 705020 |
| Religious studies | 0.00 | 197265 |
| Dance | 0.01 | 6050 |
| English language | 0.02 | 705020 |
| English literature | 0.06 | 600190 |
| Art_ 3D studies | 0.07 | 9430 |
| Combined science | 0.09 | 448585 |
| Drama | 0.14 | 44480 |
| Biology | 0.15 | 169880 |
| Citizenship studies | 0.15 | 17480 |
| Physics | 0.21 | 169555 |
| Chemistry | 0.22 | 169725 |
| Physical education | 0.23 | 70465 |
| Sociology | 0.28 | 29695 |
| Design and technology | 0.28 | 72990 |
| History | 0.31 | 300845 |
| Geography | 0.32 | 273310 |
| Business studies | 0.32 | 120315 |
| Statistics | 0.41 | 22300 |
| Music | 0.52 | 29470 |
| Latin | 0.58 | 4810 |
| Psychology | 0.78 | 19120 |
| Computing | 0.80 | 87290 |
| German | 0.80 | 32360 |
| Spanish | 0.82 | 119715 |
| French | 0.84 | 122340 |
| Classical civilisation | 0.92 | 3185 |
| Economics | 0.92 | 7325 |
| Engineering | 1.21 | 2715 |

References

[i] Kelly, A. (1975). The relative standards of subject examinations. Research Intelligence, 1(2), 34–38.

[ii] Bramley, T. (2016). The effect of subject choice on the apparent relative difficulty of different subjects. Research Matters: A Cambridge Assessment publication, 22, 23–26. /Images/374638-the-effect-of-subject-choice-on-the-apparent-relative-difficulty-of-different-subjects.pdf.

[iii] Benton, T. (2016). On the impact of aligning the difficulty of GCSE subjects on aggregated measures of pupil and school performance. Research Matters: A Cambridge Assessment publication, 22, 27–30. /Images/374665-on-the-impact-of-aligning-the-difficulty-of-gcse-subjects-on-aggregated-measures-of-pupil-and-school-performance.pdf.
