Flashcard Fundamentals: Choosing the Right Question

Memory retrieval is the process of fetching memories from storage and making them usable. Practising retrieval can be a fantastic aid to learning new information, and building a solid knowledge base is important in any classroom or training setting. Traditionally, retrieval practice is done with flashcards, and while digital tools like MemoryLab have improved on the format, the core of flashcard-style learning has remained the same. There are two elements: questions and answers. There are subtleties to both, and this blog post will help you find the best ways to ask the best questions.

There are two main types of questions in retrieval practice:

Recognition questions

The learner is presented with the answer alongside some distractors, and has to recognise the correct alternative.

E.g. The season that follows winter is:
a) autumn
b) summer
c) spring
d) winter

Recall questions

The learner has to produce the answers themselves.

E.g. The season that follows winter is:
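
For readers who build their own question sets, the two formats can be sketched as two views of the same card. This is a minimal illustration, not how MemoryLab or any other tool stores its questions; the names Card, as_recall and as_recognition are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Card:
    """One fact: a prompt, its answer, and optional wrong alternatives."""
    prompt: str
    answer: str
    distractors: list[str] = field(default_factory=list)


def as_recall(card):
    """Recall format: no options, the learner must produce the answer."""
    return {"prompt": card.prompt, "options": None, "answer": card.answer}


def as_recognition(card, rng=None):
    """Recognition format: the answer is shown among shuffled distractors."""
    rng = rng or random.Random()
    options = [card.answer, *card.distractors]
    rng.shuffle(options)
    return {"prompt": card.prompt, "options": options, "answer": card.answer}


card = Card("The season that follows winter is:", "spring",
            ["autumn", "summer", "winter"])
print(as_recall(card))
print(as_recognition(card))
```

Keeping the distractors on the card means the same item can be asked either way, which makes it easy to switch formats as the learner progresses.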


In order to choose the most appropriate type of question, it’s helpful to understand the relevant cognitive theories behind memory retrieval. 

According to the well-established transfer-appropriate processing (TAP) theory (Morris et al., 1977), there is a strong relationship between how information is encoded and stored, and how it is later retrieved. Practising recall is likely to result in improved performance on subsequent recall tests. Similarly, practising recognition should make us better at retrieving information on recognition tests.

The successful-retrieval effect (Ellis, 1995) holds that successful retrievals during practice lead to better results in later tests than unsuccessful retrievals do. Successful retrievals build pathways from the question to the answer, whereas unsuccessful retrievals build pathways towards incorrect answers. Recognition formats are simply easier than recall formats, which naturally means more successful retrievals – and therefore better test results. With easy content (or more specifically, high-success retrieval content), recall formats can receive the same benefits of this effect, but with harder content, the extra difficulty can lead to more pathways towards incorrect answers. Additionally, all those wrong answers can be very demotivating. With this effect in mind, recognition formats are the wiser choice.

Conversely, the retrieval effort effect (Pyc & Rawson, 2009) posits that whichever format is most difficult will result in more activity in the parts of the brain responsible for the relevant information. This activity means pathways are being built in all directions – correct and incorrect, homing in on the truth – resulting in generally stronger networks surrounding that information. These stronger, more diverse networks are powerful tools in future tests. In defiance of the TAP theory, some research (e.g. Webb, 2009) even shows that recall can outperform recognition. This is likely due to the extra difficulty: the richer networks ease transfer between different testing formats. The transfer happening here can be explained by a similar effect, desirable difficulty, which also supports whichever format is more difficult – usually, recall. Practising with different types of questions might lead to more transfer of knowledge, too.

So, the relevant theories are somewhat contradictory. According to TAP, the question format during practice should match the test format; according to the successful-retrieval effect, easier multiple-choice questions are desirable; and according to the retrieval effort effect, more difficult open questions are the way to go. In practice, Nakata (2016) factored in the time spent studying and found that recognition was the most efficient practice format across all test formats, both directly after practice and one week later. The exception was orthographic information – that is, spelling – which was retrieved best after practising with recall.

It’s not just the type of information, like spelling, that can make a difference in which format to choose. The optimal question format is quite sensitive to the context in which it’s asked, and to whom it is being asked. Baxter et al. (2021) researched the effect of making multiple-choice questions more difficult by making the choices more similar. This is called contrasting. For the primary school children they were studying, they found that this only helped learners with a higher reading level. Nakata (2016) pointed out that researchers often compare studies conducted on different age groups, and this can lead to even more confusion and contradiction. Individual differences in learners and differences in testing can lead to differences in optimal question-making. Understanding specific students’ needs and setting specific goals (e.g. practising spelling, as opposed to grammar) can show you when to use a more demanding question.

The benefits of achieving desirable difficulty are not limited to the recall format, and by making your recognition questions more difficult, your learners can still receive these benefits. Too difficult, and the number of successful retrievals goes down. Too easy, and retrieval is effortless. There are a number of dials that can be turned, buttons that can be pressed and levers that can be switched to fine-tune the difficulty of a question. Mastering these techniques can keep your learners working within the zone of proximal development, challenging them just enough to facilitate growth. 

To achieve desirable difficulty, how can we modulate the difficulty of a recognition question?

  • Making the content of the question easier or more difficult.
    • By providing clues or extra context, you can make difficult questions more likely to be answered correctly, and by stripping them away you can do the opposite.
  • Contrasting: Making the distractors more similar.
  • Switching between forwards and backwards associations.
    • If learners have practised with one direction of association (e.g. translating from English to Dutch), testing in that direction will be easier. Reversing this direction (e.g. practising English to Dutch; testing Dutch to English) makes the learner think backwards, which is considerably harder.
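
Two of these dials, contrasting and reversing the direction of association, can be sketched in a few lines of code. This is only an illustration under stated assumptions: the similarity measure (the ratio from Python's difflib.SequenceMatcher) is a rough stand-in for orthographic similarity and is not the measure used in the studies cited here, and the function names and Dutch word list are made up for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Rough spelling similarity between 0 and 1 (an assumed proxy)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_distractors(answer, candidates, n=3, contrast=True):
    """Choose n distractors for a recognition question.

    contrast=True  -> the most similar candidates (harder question)
    contrast=False -> the least similar candidates (easier question)
    """
    pool = [c for c in candidates if c != answer]
    pool.sort(key=lambda c: similarity(answer, c), reverse=contrast)
    return pool[:n]


def reverse_card(card):
    """Swap prompt and answer to test the backwards association."""
    prompt, answer = card
    return (answer, prompt)


words = ["muis", "huid", "boom", "fiets", "huis"]
print(pick_distractors("huis", words, n=2, contrast=True))   # similar spellings
print(pick_distractors("huis", words, n=2, contrast=False))  # dissimilar spellings
print(reverse_card(("house", "huis")))
```

With contrast switched on, "huis" is shown next to look-alikes such as "muis" and "huid"; with it switched off, it is shown next to words that are easy to rule out.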

Combining these can lead to interesting results. Baxter et al. (2022) tested contrasting alongside forwards and backwards associations in language learning. They found that contrasting was only effective in one direction:

If the learners knew English and were learning Dutch, contrasting would only be effective with Dutch distractors, because these unfamiliar words are harder to compare – and it is the difficult comparison between distractors that makes contrasting so good. If the distractors are well-understood, comparison between them is easier and contrasting is somewhat redundant. When the distractors are still being learned, contrasting can be used to pursue desirable difficulty.

Now that we can use question formatting to modulate difficulty, how do we know how difficult to make our questions? What level of difficulty is desirable? While you can estimate a class average, the right level varies between students and between topics. It also changes over time as those students learn. On a student-by-student level, there is too much variability for even the most attentive teacher to track. This is where adaptive learning shines: specifically, systems that personalise the learning session to the learner. Algorithms can be that attentive. By interpreting data from each learner and each question answered, they can estimate which questions are at the optimal difficulty for that learner at that moment. Your task becomes providing the algorithm with an array of questions that vary in difficulty, so that it can choose the right one for the right moment.
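
To make the idea of choosing the right question for the right moment concrete, here is a deliberately simple sketch. It is not MemoryLab's algorithm or any published model: the target success rate of 0.8, the update rule and all names are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Question:
    prompt: str
    p_correct: float  # current estimate that this learner answers correctly


def choose_question(questions, target=0.8):
    """Pick the question whose estimated success rate is closest to the target.

    The target of 0.8 is an assumed value, not a research finding.
    """
    return min(questions, key=lambda q: abs(q.p_correct - target))


def update_estimate(question, was_correct, rate=0.3):
    """Nudge the estimate towards 1 after a correct answer, towards 0 otherwise."""
    outcome = 1.0 if was_correct else 0.0
    question.p_correct += rate * (outcome - question.p_correct)


pool = [
    Question("easy recognition item", 0.95),
    Question("contrasted recognition item", 0.75),
    Question("recall item", 0.40),
]
chosen = choose_question(pool)
print(chosen.prompt)
update_estimate(chosen, was_correct=True)
print(round(chosen.p_correct, 3))
```

The wider the range of difficulties in the pool, the more often a selection rule like this can find a question near the target for a given learner.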


Baxter, P., Bekkering, H., Dijkstra, T., Droop, M., Van Den Hurk, M., & Leoné, F. (2022). Contrasting orthographically similar words facilitates adult second language vocabulary learning. Learning and Instruction, 80, 101582. https://doi.org/10.1016/j.learninstruc.2022.101582

Baxter, P., Droop, M., Van Den Hurk, M., Bekkering, H., Dijkstra, T., & Leoné, F. (2021). Contrasting Similar Words Facilitates Second Language Vocabulary Learning in Children by Sharpening Lexical Representations. Frontiers in Psychology, 12, 688160. https://doi.org/10.3389/fpsyg.2021.688160

Ellis, N. C. (1995). The psychology of foreign language vocabulary acquisition: Implications for CALL. Computer Assisted Language Learning, 8, 103–128. https://doi.org/10.1080/0958822940080202

Morris, C. D., Bransford, J. D., & Franks, J. J. (1977). Levels of processing versus transfer appropriate processing. Journal of Verbal Learning and Verbal Behavior, 16, 519–533. https://doi.org/10.1016/S0022-5371(77)80016-9

Nakata, T. (2016). Effects of retrieval formats on second language vocabulary learning. International Review of Applied Linguistics in Language Teaching, 54(3), 257–289. https://doi.org/10.1515/iral-2015-0022

Pyc, M. A., & Rawson, K. A. (2009). Testing the retrieval effort hypothesis: Does greater difficulty correctly recalling information lead to higher levels of memory? Journal of Memory and Language, 60, 437–447. https://doi.org/10.1016/j.jml.2009.01.004

Webb, S. A. (2009). The effects of pre-learning vocabulary on reading comprehension and writing. The Canadian Modern Language Review, 65(3), 441–470. https://doi.org/10.3138/cmlr.65.3.441 
