Regular readers will know I’ve been ploughing a furrow on this question for quite a while now. Last June I synthesised my thinking in this post: Deliberately difficult – why it’s better to make learning harder. For those of you who might be unfamiliar with the arguments, I’ll summarise them briefly:

– Learning is different from performance (the definition of learning I’m using here is the long-term retention and transfer of knowledge and skills)
– We can’t actually see learning happen; we can only infer it from performance
– Performance is a very poor indicator of learning
– Reducing performance might actually increase learning

This is deeply counter-intuitive and runs against the grain of what goes on in the overwhelming majority of schools. So much so that Ofsted have enshrined the need for schools to demonstrate that pupils make ‘rapid and sustained’ progress. I argued in this post that you can’t have both; rapid progress comes at the cost of sustained progress. Unless you’ve accepted the argument that learning is invisible, this is possibly hard to swallow. But once you’ve passed through this particular threshold, it changes everything!

The evidence is compelling. Last year Soderstrom & Bjork put together a literature review which collated decades of research and hundreds of studies which support the idea that learning is separate from performance and, more troublingly, that performance might not lead to learning. This has led me to the conclusion that grading lessons is wrong and also that much of what we consider to be outstanding practice encourages a focus on rapid performance gains at the expense of sustained learning. Hence the inexorable rise of The Cult of Outstanding.

Still with me? Good. Now here comes the kicker. In the comments to my Cult of Outstanding post, Kris Boulton raises the problem that introducing desirable difficulties at the point of acquisition appears incongruous. How does that square with the need to minimise cognitive load?

This brings us to the Willingham Thesis:

  • Working memory is severely limited
  • Experts think differently to novices
  • Our brains are not well designed for thinking; instead we rely on memorised schema to solve complex problems.

The reason students don’t like school is, according to Willingham, that we make them think too much, and thinking is hard. Experts rely on retrieving whole schema (connected items of information) from long-term memory to get around the limits on our fragile working memories. Novices don’t have these memorised schema to rely on, so they attempt to hold too much information in their working memories, which leads to cognitive overload. This explains the need to teach number bonds: unless pupils have memorised the fact that 3+7=10 they’ll have to calculate it anew each time they encounter a problem containing these numbers. When calculating a complex sum, pupils forget the answer they have arrived at for the first part by the time they get to the end because they’re trying to hold too much in their minds at the same time. This is why minimal guidance during instruction does not work.

In my post on metacognition I outline another explanation for this: not only do experts know more, they also think differently. Novices concentrate on the detail of a problem and ignore its structure. The neural architecture of an expert appears different to that of a novice. So when we ask novices to approach problems like experts, their lack of knowledge stored in long-term memory prevents them from being able to think metacognitively.

This would appear to be a clear-cut case for making learning easier. In her article Making Learning Easier And Making It Harder: Both Are Necessary, Annie Murphy Paul attempts to force these colliding perspectives together with the recommendation that we do both. But is that really possible? Is this a real or a false dichotomy? I always find duality a useful thinking tool and instinctively want to see this as a situation where you can’t just do a bit of both. Paul suggests we reduce cognitive load at the point of instruction and only introduce difficulties once new concepts have been mastered. What she’s suggesting, as far as I can understand, is that first we make learning easy, and then we make it hard. But is that actually possible? How do we actually get our pupils to the point of mastery? The research seems to suggest it’s the struggle that enables us to learn. If we’re waiting for the learning to happen before providing the conditions that best allow learning to occur, won’t we find ourselves in a hopeless muddle?

“But luckily we have critique protocols.” Ron Berger

Dylan Wiliam got in touch via Twitter to critique Paul’s post and point out a missing piece of the puzzle:

[Image: screenshot of Dylan Wiliam’s tweet]

This is something I’ve been thinking about for some time: most teaching is about improving retrieval strength. We provide cues and contexts that make it easier for pupils to retrieve information during instruction. But how do we improve storage strength? And once we’ve stored information, what then? This is, I think, where the desirable difficulties of spacing and interleaving come into their own.

This, then, is my attempt to reconcile the irreconcilable. First we need to get information in, then we need to make it stick.

We get it in through explicit explanation and modelling. I can’t see much benefit to giving students garbled information (although according to this study charisma and quality of exposition may have little bearing on how well information is retained, so I could be wrong). The trouble is that, left to our own devices, we will subsequently forget about 70% of everything we’ve learned.

[Image: quote from Nuthall (2005), graph from Ebbinghaus]

UPDATE: Dylan Wiliam suggested this is a better representation of Ebbinghaus’s 1885 data:

[Image: Ebbinghaus’s 1885 data]

Once new information has been encountered, we now need to make it stick. The least useful thing to do is to teach or review what’s just been taught: if we want it to be retained we’re better off using testing, generation, spaced retrieval practice and interleaving different topics. This is deeply counter-intuitive because it feels hard. If study is easy it produces the illusion of knowing; we think we’re learning. Reducing performance makes us feel like we’re not getting any better, but the evidence is, time and time again, that making learning deliberately difficult makes us learn better. And weirdly, forgetting makes space for us to better store information. Items we’ve not practised retrieving are more likely to be forgotten in the short term, but forgetting increases the chances of retaining information that is recalled in spaced retrieval practice.

But why? Well, when asked what he thought a good proxy for learning might look like, Professor Coe came up with this: “Learning happens when you think hard about subject content.” In struggling to piece information together into schema we have to struggle with what we’re learning. If we’ve forgotten part of the schema we have to work hard to dredge it up from long-term memory. It’s all real effort and we make mistakes and get it wrong. It feels like we’re not getting better. But we are. Arguably, we might be able to remember up to 90% of what we learn by taking advantage of the forgetting curve and spaced retrieval practice.

Maybe we can also profit by adding in Kahneman’s model of Thinking, Fast and Slow: we need the process of slow, deliberate System 2 thinking to allow us to build up the necessary schemas in long-term memory, which will in turn lead to better, lightning-fast, automatic System 1 thinking.

Now obviously this isn’t the complete picture. I’d be the first to object that there’s more to education than getting stuff in and making it stick. But hopefully no one will dispute that this is an essential consideration and one it behooves us to get right. Yes, context matters. Of course motivation matters. Naturally, no one way of approaching a problem will work in every situation with every pupil. There is no magic formula for success. Expert teachers will always be required to make expert judgements about what might constitute the right level of difficulty for each pupil. But this is about our ability to make meaningful predictions about what might be effective for most pupils in most situations.

In summary, we can’t wait until students have mastered a subject before introducing difficulty; it’s the difficulty that leads to mastery. Cognitive load theory reminds us that pupils will struggle to solve complex problems with minimal guidance, but the best way to build long-term memory to overcome the limitations of working memory might be to reduce classroom performance and ‘think hard about subject content’ in order to improve storage strength of the concepts needed to think like experts. If we want learning to be easy, we need to make it hard.

NB: I’m about a fifth of the way through the papers cited in Soderstrom and Bjork’s literature review and am in danger of being swamped by cognitive bias. I would be extremely grateful for any research which seems to contradict any of the points made above.