Science is a way of trying not to fool yourself. The first principle is that you must not fool yourself, and you are the easiest person to fool.
Richard Feynman
Yesterday we were told that the much-vaunted testing effect (which I’ve written about here) has been effectively shown to be useless for improving the learning of ‘complex’ material. Tamara van Gog and John Sweller’s provocatively titled paper, Not New, but Nearly Forgotten: the Testing Effect Decreases or even Disappears as the Complexity of Learning Materials Increases, explores the ‘boundary conditions’ of the effect. The abstract of the paper says,
[One] potential boundary condition concerns the complexity of learning materials, that is, the number of interacting information elements a learning task contains. This insight is not new, as research from a century ago already had indicated that the testing effect decreases as the complexity of learning materials increases, but that finding seems to have been nearly forgotten. Studies presented in this special issue suggest that the effect may even disappear when the complexity of learning material is very high. Since many learning tasks in schools are high in element interactivity, a failure to find the effect under these conditions is relevant for education. Therefore, this special issue hopes to put this potential boundary condition back on the radar and provide a starting point for discussion and future research on this topic.
So, what is meant by ‘complexity’ and ‘interactivity’? Many studies of the testing effect have concentrated on the retention of simple nuggets of information which can be learned in isolation. As the paper puts it,
For example, when learning a list of new Spanish words with their English translation, one item, such as “gato–cat,” can be memorized without reference to another item, such as “perro–dog.” Or in history, the fact that “World War I began in 1914 and ended in 1918” can be learned without reference to another fact, like “The Netherlands was neutral in World War I.”
But when items to be learned interact with each other with one piece of information depending on an understanding of another, the topic to be learned can be said to be more complex.
For instance, when learning about the mechanics of a hydraulic car brake system in engineering, it is not only necessary to learn the individual components in the system (e.g., pistons, cylinders), but also how these components interact with each other (e.g., principles of hydraulic multiplication and friction). Moreover, the aim is usually not just to learn how the system works, but also to be able to apply that knowledge to real-world tasks (e.g., being able to diagnose and repair faults in a system).
Van Gog and Sweller “suggest that the complexity of learning materials may reduce or even eliminate … the testing effect.” They conclude their account by saying,
The studies collected in this special issue suggest that the complexity of learning materials might constitute another boundary condition of the testing effect, by showing that the effect decreases or even disappears with complex learning tasks that are high in element interactivity, which are plentiful in education.
Damn! Honestly, I hadn’t even considered that this boundary condition might exist. Does this mean we should forget about the testing effect? Is it something which can only usefully be relied upon in psychology labs where psychology undergraduates are tested on simple, non-interactive facts? Van Gog and Sweller are careful to point out that we shouldn’t be too hasty:
… the studies presented in this special issue suggest that it would be worthwhile to conduct further research on the complexity of learning and test tasks as a potential boundary condition of the testing effect. It would help teachers and instructional designers to know for which learning tasks they can and cannot expect benefits of having their students take practice tests instead of engage in further study. Therefore, we hope that this special issue encourages debate about and future research on the question of how the complexity of learning materials affects the testing effect.
And before you know it – the very next day in fact – Karpicke and Aue drop this little bombshell: The Testing Effect Is Alive and Well with Complex Materials. I haven’t actually been able to read it as it’s behind a paywall, but the abstract says,
Van Gog and Sweller (2015) claim that there is no testing effect—no benefit of practicing retrieval—for complex materials. We show that this claim is incorrect on several grounds. First, Van Gog and Sweller’s idea of “element interactivity” is not defined in a quantitative, measurable way. As a consequence, the idea is applied inconsistently in their literature review. Second, none of the experiments on retrieval practice with worked-example materials manipulated element interactivity. Third, Van Gog and Sweller’s literature review omitted several studies that have shown retrieval practice effects with complex materials, including studies that directly manipulated the complexity of the materials. Fourth, the experiments that did not show retrieval practice effects, which were emphasized by Van Gog and Sweller, either involved retrieval of isolated words in individual sentences or required immediate, massed retrieval practice. The experiments failed to observe retrieval practice effects because of the retrieval tasks, not because of the complexity of the materials. Finally, even though the worked-example experiments emphasized by Van Gog and Sweller have methodological problems, they do not show strong evidence favoring the null. Instead, the data provide evidence that there is indeed a small positive effect of retrieval practice with worked examples. Retrieval practice remains an effective way to improve meaningful learning of complex materials. [my emphasis]
So there you go. Make of this what you will. The lesson – if there is one – is that it pays to be skeptical and to withhold judgment rather than leaping to confirm our biases. Because seriously, what if you’re wrong?
But have complex learning tasks been tried in real situations? All learning is contingent. Learning a complex matter (e.g. the symptoms of a disease, or a structural fault in a building) can only really be reviewed in situ. The knowledge in these examples is often acquired and then tested within education. However, the proof of the pudding (or the worth and usefulness of the ‘retrieval’) is in the application. It’s all very well for psychologists to argue the toss in a lab; the real crunch is how practitioners use and apply this knowledge. This then leads to questions of how to teach these applications because, as I said, all learning is contingent.
Hi Michael – yes, if you read the papers you can see that they refer to classroom studies. However, classroom studies usually tell us whatever we want to hear 😉
Everything is contingent on something, so of course that applies to learning too. In my new book I write about some of the ways teachers are using the testing effect in real classrooms (and btw – they’re not necessarily using tests.)
I don’t think I mean ‘classrooms’. So, in the example you give of a machine… We can all ‘know’ about machines, but until you’re actually face to face with one, trying to figure out e.g. why it’s gone wrong, or how to design an improved one, we don’t actually know how useful our knowledge is. And then we usually discover something else about the knowledge that we have.
I once sat in front of a kidney expert because my GP had diagnosed me with kidney failure. He got me to talk about life in general while he scrutinised my face. After a bit, he interrupted me and said abruptly that he thought the whole kidney theory was rubbish and diagnosed me as having an under-active thyroid. Then he called in his students, asked them to diagnose me, and left the room. They asked me a few things and the expert (consultant) came back into the room a few minutes later. ‘So,’ he said, ‘what’s Mr Rosen got?’ Kidney failure, they said. He blew up. ‘You’re only saying that because we’re in the kidney department (metabolic unit).’
I’ve often thought about this in relation to ‘knowledge’. Both the consultant and the students ‘knew’ about kidney failure (symptoms, anatomy, physiology, pathology). Both he and the students knew about thyroid failure (symptoms, anatomy, physiology, pathology). Yes, he ‘knew’ more than they did, but it wasn’t really that ‘knowing more’ that enabled him to make the breakthrough (he was right, by the way!). It was that he was prepared to ‘think outside the box’. We can argue perhaps about why or how he could do that (using knowledge of kidney and thyroid failure), but what he did next was illuminating. He got the students to conduct the very simplest and most basic of tests: touching my skin, checking my pulse rate, checking my knee reflex and asking me to ‘walk a line’ – all of which would show classic thyroid failure symptoms. He berated them for not doing these simple tests which they all ‘knew’.
That’s a fascinating anecdote and a wonderful example of the availability bias in action, thank you. I’ve written about how the availability heuristic is misused when we look at data here: https://www.learningspy.co.uk/myths/availability-bias-problem-data/
There is a conflict here, perhaps, between the nature of testing as it is applied in the papers above and the nature of testing as it is applied in education. The process we have in schools is that testing is the culmination of a process – the evidence you present suggests that it should be the process – and I would argue that this is the premise of the AfL process.
I have been exploring some use of this in my classroom, training primary teachers in science – most of them have little memory of science (they last “did” any formal learning in science at least 5 years before their PGCE, and for many more than that) – so we are looking at using “micro-tests” at the end of each session to see how this impacts their knowledge overall.
I do agree with Michael that it is in the application of this knowledge that its usefulness is most apparent – and the nature of testing in schools (mostly) is not about this: it is independent, focussed on memory, done under stress and for high stakes, with little chance of redress. I would argue that the papers above demand we seriously re-think the nature of testing in schools, and challenge the process of terminal examinations designed to rank schools and pupils, not least in light of the debate about the nature of complexity in knowledge – we fall foul of the McNamara fallacy time after time in school.
There’s a vast conceptual gulf between testing as an assessment practice and the psychological principle, the ‘testing effect’. Do follow my link to see the difference, but for the testing effect to be harnessed, testing ought to be low or no stakes and done without any pressure or consequences. And it is categorically not concerned with gathering data, so it should not end up falling foul of McNamara’s fallacies.
A full PDF copy of The Testing Effect Is Alive and Well is posted at the Purdue University Cognition and Learning Lab, linked at the top of this page – and no paywall.
http://learninglab.psych.purdue.edu/publications/
Interesting read as Karpicke and Aue do not mince words.
The questions in the test are key to learning. A complex question which tests understanding and application as well as recall of facts will, if answered incorrectly or partially incorrectly by a student, help to identify the student’s gaps in knowledge or understanding in specific areas when reanalysed by both student and teacher. These weak areas can then be focussed upon by the student and teacher. Therefore testing is useful in the learning process.