Why knowledge-based exams are harmful to the goals of education.
More than a century of research has demonstrated that the best long-term retention of educational materials is achieved when the materials are studied in a spaced fashion. This means that students should not spend all the time they have available for studying in a single session, but should instead distribute it over multiple shorter sessions that are “spaced out” over time. However, as most of us know, it is very tempting to study in the last days, or even the last hours or minutes, before the test. Here I will argue that this is not due to laziness, or to adolescents lacking the brainpower for better planning, but to students being rational. Because of this, students show behavior that is optimal for the task the educational system imposes on them, yet goes against the goals of education.
What goals does education have? From the perspective of society, one of the core goals of education is to equip students with the knowledge and skills that will enable a successful future. However, for many students the goal is much more straightforward: “to pass with the highest possible grade”. Or, in the case of less motivated students: “to pass with the minimum passing grade”. Even though we might consider these two types of students very distinct, they share the principle of rationality: both optimize their grade with the least effort invested, as all resources not used can be invested in other courses, or in hobbies, sports, or social activities.
Obviously, a teacher might stress repeatedly that one does not study for the grade on an exam, but for one's own future. However, whereas the publication of exam grades in the gradebook is a very concrete event that will happen in the near future, it is much harder to imagine that French vocabulary items might be needed in a business setting in a far-away future. Moreover, society evaluates the academic success of students in the hard currency of their average grade. Without a certain grade level, further educational options or job opportunities might simply be out of reach. In other words, it is not at all surprising that students aim to optimize towards the highest possible grades.
Yet, as noted at the beginning of this post, spacing yields much better long-term retention. What explains the dissociation between this scientific observation and the realities of real-world learning? The key is the distinction between “long-term retention” and the single assessment moment of an exam. When measured over a retention span of weeks, months, or years, spaced practice indeed yields a much higher payout than a night of cramming just before the exam. However, spacing is detrimental when the aim is a high grade on a test.
A simple example will make this clear. Let's assume that a secondary education pupil can set aside 45 minutes to study a list of 20 French vocabulary items. For simplicity's sake, let's also assume that there are two options to schedule those 45 minutes: the time could be distributed over three blocks of 15 minutes of studying, say one session two weeks before, one session one week before, and one session in the break just before the test; or the pupil could study for 45 minutes straight in the break just before the test. Which would result in the highest grade on the exam? The data is clear (e.g., Cepeda et al., 2008): the grade is higher after 45 minutes of cramming, which makes sense, because the cramming was so recent that the information has not really had time to decay. However, the data is equally clear that this type of cramming is terrible for long-term retention: just a couple of hours after the cramming, a large proportion of the just-studied materials will already have been forgotten, as a single cramming session does not result in stable long-term consolidation. Indeed, in line with the advice of many teachers, the 3 x 15 minutes scheme would have resulted in a much better score if a surprise test were given after a couple of weeks. Thus, students are faced with an impossible dilemma.
Do they want to aim for a “Cum Laude” grade, but have less than optimal performance in terms of long-term retention given their invested time, or do they want to aim for the best possible long-term retention, but make do with a suboptimal grade?
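To make the dilemma concrete, here is a minimal toy simulation of the two schedules from the example above. It is only a sketch under loud assumptions: it is not the model or the data of Cepeda et al. (2008), and the two half-lives, the square-root rule for the durable memory trace, and the function names are all invented for illustration. The only point is that a fast-fading trace rewards cramming on exam day, while a slowly fading trace with diminishing returns within a session rewards spacing later on.

```python
import math

# Illustrative assumptions, not fitted values.
FAST_HALF_LIFE_DAYS = 1.0    # short-lived trace: mostly gone within days
SLOW_HALF_LIFE_DAYS = 60.0   # durable trace: fades over months


def decay(amount, elapsed_days, half_life_days):
    """Exponential forgetting: the amount halves every half_life_days."""
    return amount * 0.5 ** (elapsed_days / half_life_days)


def strength_at(test_day, sessions):
    """Memory strength (arbitrary units) on test_day.

    sessions is a list of (day, minutes) tuples. Each session leaves a
    fast-fading trace proportional to the minutes studied, and a durable
    trace that grows only with the square root of the minutes, i.e. with
    diminishing returns within a single session (an assumption).
    """
    total = 0.0
    for day, minutes in sessions:
        elapsed = test_day - day
        if elapsed < 0:
            continue  # this session has not happened yet
        total += decay(minutes, elapsed, FAST_HALF_LIFE_DAYS)
        total += decay(math.sqrt(minutes), elapsed, SLOW_HALF_LIFE_DAYS)
    return total


EXAM_DAY = 14
spaced = [(0, 15), (7, 15), (14, 15)]   # 3 x 15 minutes
crammed = [(14, 45)]                    # 45 minutes just before the exam

for label, day in [("exam", EXAM_DAY), ("surprise test", EXAM_DAY + 14)]:
    print(f"{label}: spaced={strength_at(day, spaced):.1f}, "
          f"crammed={strength_at(day, crammed):.1f}")
```

With these made-up parameters, the crammed schedule comes out ahead on exam day and the spaced schedule comes out ahead two weeks later, which is the pattern described above. Different parameter choices would shift the sizes of the two gaps.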
Any for-grade exam for which materials need to be committed to memory will induce this dilemma. It is therefore clear that such exams are detrimental to long-term retention, one of the goals of education. The solution to this problem is straightforward: when exams are the problem, let's get rid of exams.
It might not be surprising that students love this message (a Dutch TikTok video covering this work was watched 430,000 times in the first 24 hours). Yet, removing exams does not mean that materials do not need to be studied. Instead of studying for a test at a predetermined time, students should study until it is established, during studying itself, that they know the materials. One approach, which is highly impractical, is to assign a teacher to each student. Another approach, which we have successfully developed and tested in a number of different school settings, is to have students learn using an adaptive learning system that indicates when they know the materials well enough. Such systems should, obviously, be well designed to account for the wide spectrum of cognitive capacities and be optimally inclusive, but when they meet these criteria, they could replace for-grade exams. Instead of needing to demonstrate performance during an arbitrary time slot of a school day, students could be required to study the materials two or three times, with sufficient spacing between the sessions, to a level of mastery defined by their teacher. A student with a knack for remembering French vocabulary might be done sooner than a student who struggles with French, but both will have control over their own studying, and neither will need to fear a blackout during the exam.
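The requirement sketched above (a few sessions, sufficiently spaced, each reaching a teacher-defined mastery level) can be written down as a simple completion check. The following is a hypothetical illustration only: it is not how any existing adaptive learning system works, and the names `Session` and `requirement_met`, the 0 to 1 mastery scale, and the default thresholds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Session:
    day: int          # day on which the study session took place
    mastery: float    # mastery reached in that session, on an assumed 0..1 scale


def requirement_met(sessions, mastery_level=0.8, required_sessions=3,
                    min_gap_days=2):
    """Return True once enough well-spaced sessions reached mastery.

    A session counts only if it reached mastery_level and took place at
    least min_gap_days after the previously counted session, so several
    sessions crammed into one day count as a single session.
    """
    counted_days = []
    for session in sorted(sessions, key=lambda s: s.day):
        if session.mastery < mastery_level:
            continue  # mastery not reached in this session
        if counted_days and session.day - counted_days[-1] < min_gap_days:
            continue  # too close to the previously counted session
        counted_days.append(session.day)
    return len(counted_days) >= required_sessions


spaced_student = [Session(0, 0.9), Session(1, 0.95), Session(4, 0.85),
                  Session(9, 0.7), Session(10, 0.9)]
cramming_student = [Session(0, 0.9), Session(0, 0.9), Session(0, 0.95)]

print(requirement_met(spaced_student))    # True: days 0, 4 and 10 count
print(requirement_met(cramming_student))  # False: only one spaced session
```

In this sketch the teacher controls the outcome through `mastery_level`, `required_sessions`, and `min_gap_days`, while the student decides when to study.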
At the University of Groningen and at MemoryLab (together with the MSA, Utrecht University, and NOLAI), we have been working on developing and testing this approach. Using MemoryLab's online adaptive learning systems, students can study using retrieval practice (or recall-based methods) until their performance during studying indicates that their level of mastery is sufficient for that session. Our work demonstrates that with this approach, knowledge levels are actually higher than when students are required to learn for a test.
Therefore, both on theoretical and practical grounds, it is time to abolish traditional quizzes on factual materials, and replace them with adaptive learning-based studying methods to equip students in the most efficient way for the best possible future.
Cepeda, N. J., Vul, E., Rohrer, D., Wixted, J. T., & Pashler, H. (2008). Spacing Effects in Learning: A Temporal Ridgeline of Optimal Retention. Psychological Science, 19(11), 1095–1102. https://doi.org/10.1111/j.1467-9280.2008.02209.x