On each results day for A levels and GCSEs, Ofqual publish a raft of accompanying data on their website, including some that can be accessed as interactive visualisations. One of these enables users to select any pair of GCSE subjects (or even a combination of three) and compare the grade distributions for candidates who took both, based upon data provided by all of the awarding organisations in England. For example, you can review the grades of the 11,860 students who took GCSEs in both Drama and Biology and note that slightly more of them achieved grade 9 in Biology (13.6%) than in Drama (11.9%). From this you might jump to the conclusion that Biology is easier than Drama.
The problem with drawing conclusions from a single data visualisation on the Ofqual website is that they can mislead. Evidence from different pairs of subjects can be contradictory about their relative difficulty and, to truly understand what these statistics are telling us, we need to analyse them further. This is challenging, given the sheer number of pairs that can be considered using Ofqual’s presentation. There are 44 GCSE subjects in their data. Theoretically, this would allow for 946 pairs of GCSE subjects. Rising to that challenge, I spent a couple of hours of repetitive pointing and clicking, downloaded data on every one of the pairwise comparisons between GCSE subjects taken in summer 2024, and analysed them with a proper statistical method. In the end there were only 866 subject pairs to consider, as Ofqual do not report data for pairs taken by fewer than 25 students.
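The pair counts quoted above can be checked with a couple of lines of arithmetic. This is only a quick sketch; the variable names are mine and are not part of Ofqual's data.

```python
from math import comb

n_subjects = 44                  # GCSE subjects in Ofqual's data
n_pairs = comb(n_subjects, 2)    # unordered pairs: 44 * 43 / 2
n_reported = 866                 # pairs with data actually available
n_missing = n_pairs - n_reported # pairs with no reported data

print(n_pairs, n_missing)        # 946 80
```

So 80 of the theoretically possible pairs had no data to download.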
The most comprehensive method for the analysis of subject pairs data is the Kelly method (see Kelly, 1975[i]). To use this method, we first convert GCSE grades to numerical scores ranging from 0 to 9. The aim is to ensure that, after adjustment, grades in any subject are on average no higher than the grades achieved in other subjects by the same set of students. Specifically, the method calculates additive adjustments to grades so that, across all the pairs involving any particular subject, the mean adjusted grade in that subject equals the mean of the adjusted grades achieved in the paired subjects. These adjustments are known as the Kelly difficulty ratings.
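To make the balancing condition concrete, here is a minimal sketch of one way it could be solved from pairwise summaries. It rests on several assumptions of mine rather than on anything stated by Kelly (1975) or Ofqual: each pair is summarised as (subject A, subject B, number of students, mean grade in A, mean grade in B), pairs are weighted by their number of students, one anchor subject is fixed at zero, and every subject is linked to the anchor through some chain of pairs. The function name `kelly_difficulties` and the three-subject figures are invented purely for illustration.

```python
def kelly_difficulties(pairs, anchor, tol=1e-10, max_iter=10_000):
    """Estimate additive difficulty ratings from pairwise mean grades.

    pairs  -- iterable of (subj_a, subj_b, n, mean_a, mean_b), where the
              means are taken over the n students who sat both subjects
    anchor -- subject whose rating is held at 0 (assumption: all other
              subjects connect to it through at least one chain of pairs)
    """
    # For each subject, record (other subject, n, mean in other - mean in this).
    links = {}
    for a, b, n, mean_a, mean_b in pairs:
        links.setdefault(a, []).append((b, n, mean_b - mean_a))
        links.setdefault(b, []).append((a, n, mean_a - mean_b))

    rating = {subject: 0.0 for subject in links}
    for _sweep in range(max_iter):
        biggest_change = 0.0
        for subject, partners in links.items():
            if subject == anchor:
                continue  # the anchor stays at zero
            total_n = sum(n for _other, n, _gap in partners)
            # Balance: mean adjusted grade in this subject equals the
            # student-weighted mean adjusted grade in its paired subjects.
            new = sum(n * (gap + rating[other])
                      for other, n, gap in partners) / total_n
            biggest_change = max(biggest_change, abs(new - rating[subject]))
            rating[subject] = new
        if biggest_change < tol:
            break
    return rating


# Invented toy data, NOT Ofqual figures.
toy_pairs = [
    ("Mathematics", "Subject B", 1000, 5.0, 4.5),
    ("Mathematics", "Subject C", 200, 5.25, 6.25),
    ("Subject B", "Subject C", 300, 4.0, 5.5),
]
ratings = kelly_difficulties(toy_pairs, anchor="Mathematics")
print({subject: round(value, 2) for subject, value in ratings.items()})
```

On the toy data, Subject B comes out about half a grade harder than Mathematics and Subject C about one grade easier. Repeatedly updating each rating in turn is a simple way to solve the underlying linear equations; a direct linear solve would be an equally valid choice.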
The full results of applying this analysis to the data from all 866 pairwise visualisations are shown in Table 1. Subjects are ordered from the statistically easiest to the statistically hardest. The final column records the highest number of students recorded in any pair (usually in pairs with Mathematics or English language). This provides some idea of the relative popularity of subjects. Kelly difficulty ratings are expressed relative to Mathematics, which is fixed at zero: negative values indicate subjects that are statistically easier than Mathematics and positive values indicate subjects that are statistically harder.
If we naively accept the results in this table, the Russian GCSE is almost 3 grades easier than Mathematics whereas Engineering is more than one grade harder. Whilst this is interesting, even a cursory review of the results reveals why we do not simply adjust grade boundaries so that all subjects have the same statistical difficulty by this definition. It is notable that the statistically “easiest” GCSEs tend to be in art subjects or minor modern languages. It is reasonable to expect that a high proportion of students entering these subjects are drawing on aptitudes and experiences gained outside of school, which might explain why they perform relatively well. Explaining why certain subjects are statistically “harder” is more challenging. However, it is notable that the three “hardest” subjects in Table 1 are all relatively rarely taken. If we return to the Drama/Biology example, once all the data are taken into account the two subjects appear to be of roughly equivalent difficulty (ratings of 0.14 and 0.15 respectively).
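Because the ratings are additive and share a common zero point, the estimated gap between any two subjects is just the difference between their ratings. The short sketch below uses values copied from Table 1; the helper name `grade_gap` is hypothetical.

```python
# Kelly difficulty ratings copied from Table 1 (Mathematics = 0.00).
kelly = {
    "Russian": -2.82,
    "Mathematics": 0.00,
    "Drama": 0.14,
    "Biology": 0.15,
    "Engineering": 1.21,
}


def grade_gap(subject, reference):
    """Estimated difficulty gap in grades; positive means `subject`
    is statistically harder than `reference`."""
    return round(kelly[subject] - kelly[reference], 2)


print(grade_gap("Biology", "Drama"))            # 0.01
print(grade_gap("Engineering", "Mathematics"))  # 1.21
print(grade_gap("Mathematics", "Russian"))      # 2.82
```

The Biology/Drama gap of 0.01 grades is negligible next to the 1.7 percentage point difference in grade 9 rates that the single pairwise comparison suggested.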
Whilst we cannot rely on an entirely statistical approach to define how hard different subjects should be relative to one another, it remains an important topic. Statistics of the type published by Ofqual are just one element of this debate: a source of evidence that I hope this blog has summarised completely. This means there’s no need for you to spend time staring at data from the hundreds of possible pairs individually, wondering what interpretation to take (unless you really want to).
For further reading on the difficulty of aligning subjects purely statistically see Bramley (2016)[ii]. For some reassurance regarding the effect that variable subject difficulty has on school accountability measures see Benton (2016)[iii].
Table 1: Estimated Kelly difficulties of GCSE subjects based upon publicly available data published by Ofqual
| GCSE Subject | Kelly Difficulty | Maximum N in any subject pair |
| --- | --- | --- |
| Russian | -2.82 | 1755 |
| Polish | -2.55 | 3845 |
| Portuguese | -2.00 | 1530 |
| Italian | -1.20 | 2605 |
| Urdu | -1.05 | 3290 |
| Art_ Photography | -0.74 | 38650 |
| Arabic | -0.68 | 2980 |
| Chinese | -0.41 | 3735 |
| Art | -0.24 | 49475 |
| Food preparation and nutrition | -0.24 | 53265 |
| Art_ Fine art | -0.22 | 61675 |
| Art_ Textiles | -0.17 | 12320 |
| Art_ Graphics | -0.14 | 8975 |
| Film studies | -0.10 | 5670 |
| Media studies | -0.01 | 26670 |
| Mathematics | 0.00 | 705020 |
| Religious studies | 0.00 | 197265 |
| Dance | 0.01 | 6050 |
| English language | 0.02 | 705020 |
| English literature | 0.06 | 600190 |
| Art_ 3D studies | 0.07 | 9430 |
| Combined science | 0.09 | 448585 |
| Drama | 0.14 | 44480 |
| Biology | 0.15 | 169880 |
| Citizenship studies | 0.15 | 17480 |
| Physics | 0.21 | 169555 |
| Chemistry | 0.22 | 169725 |
| Physical education | 0.23 | 70465 |
| Sociology | 0.28 | 29695 |
| Design and technology | 0.28 | 72990 |
| History | 0.31 | 300845 |
| Geography | 0.32 | 273310 |
| Business studies | 0.32 | 120315 |
| Statistics | 0.41 | 22300 |
| Music | 0.52 | 29470 |
| Latin | 0.58 | 4810 |
| Psychology | 0.78 | 19120 |
| Computing | 0.80 | 87290 |
| German | 0.80 | 32360 |
| Spanish | 0.82 | 119715 |
| French | 0.84 | 122340 |
| Classical civilisation | 0.92 | 3185 |
| Economics | 0.92 | 7325 |
| Engineering | 1.21 | 2715 |
References
[i] Kelly, A. (1975). The relative standards of subject examinations. Research Intelligence, 1(2), 34–38.
[ii] Bramley, T. (2016). The effect of subject choice on the apparent relative difficulty of different subjects. Research Matters: A Cambridge Assessment publication, 22, 23–26. /Images/374638-the-effect-of-subject-choice-on-the-apparent-relative-difficulty-of-different-subjects.pdf.
[iii] Benton, T. (2016). On the impact of aligning the difficulty of GCSE subjects on aggregated measures of pupil and school performance. Research Matters: A Cambridge Assessment publication, 22, 27–30. /Images/374665-on-the-impact-of-aligning-the-difficulty-of-gcse-subjects-on-aggregated-measures-of-pupil-and-school-performance.pdf.