Sadly, I missed most of the Friday. I spent the morning speaking at a maths conference (I know, right?) on correcting the mistakes made in the name of ‘numeracy across the curriculum’. If you’re interested, I argued that whilst numeracy has a pretty superficial connection with much that goes on in other subjects, mathematical thinking would be a far more powerful way to explicitly teach pupils to filter how they view the curriculum. I may blog on this at some point in the future.
Then, channelling the spirit of the John Cleese film Clockwise, I had to race across to Wellington in a 1979 VW campervan. I arrived for my debate with the legendary Dylan Wiliam with about five minutes to spare, sweating profusely.
For those who don’t know, I wrote a post earlier in the year in which I set out my ideas on why AfL might be wrong. Generously, Dylan took the time to set out a defence and we subsequently agreed to thrash out our differences in front of a live audience. Not having met Dylan before, I wasn’t sure what to expect. There is no doubt that he is a good deal wiser and better read than I – I suspect that if he’d wanted he could easily have made me look foolish. However, I found him as generous in person as he has been online.
This is my position: The ‘big idea’ of formative assessment, that teachers should “Use evidence about learning to adapt teaching and learning to meet student needs” is wrong. It assumes that you can assess what pupils have learned in an individual lesson, and then adjust future teaching based on this information. But “If nothing has changed in long-term memory, nothing has been learned.”
One of the biggest mistakes I think we’ve made in teaching is believing that we can see learning. But, as Robert Bjork tells us, learning is distinct from performance. We cannot see learning because it takes place inside children’s heads; we can only see what they do. If we measure pupils’ performance we can only ever infer what might have been learned, but such inferences are highly problematic for two reasons. Firstly, performance in the classroom is highly dependent on the cues and stimuli provided by the teacher; secondly, performance is a very poor indicator of how well pupils might retain, or be able to transfer, knowledge or skills.
Dylan has suggested 5 ‘key strategies’ that are required to embed formative assessment. They are:
- Clarifying, sharing, and understanding learning intentions and success criteria
- Eliciting evidence of learners’ achievement
- Providing feedback that moves learning forward
- Activating students as instructional resources for one another
- Activating students as owners of their own learning
Broadly speaking, I don’t really have a problem with numbers 1, 3 or 5, but 2 and 4 are more problematic.
The idea that we should spend lesson time on eliciting evidence of learners’ achievement misses the fact that such achievement is only evidence of what pupils can do in a particular lesson; it in no way allows us to work out what they will still be able to do in future lessons, because as learning occurs, so does forgetting.
Dylan did point out that much of the research cognitive psychologists conduct is carried out on psychology undergraduates, and that we need to be cautious about implementing these findings wholesale in classrooms as children may well behave differently. But I’m with Carl Wieman on this: research conducted in laboratory conditions produces more empirically robust data. And whilst this data may be at odds with the reality of classrooms, it still allows us to make meaningful and measurable predictions about how pupils are likely to behave.
The eliciting that Dylan recommends is basically asking questions. The only useful information we can get from doing this comes when pupils don’t know the answer; this at least affords us an opportunity to do something about it. But if they answer our questions correctly, it means very little. Just because they know it now doesn’t mean they’ll know it next lesson or in an exam. A correct answer to a question is, perhaps, the least useful student response we can hope for. Dylan pitched in with the observation that failing to answer a question correctly is more likely to lead to pupils retaining information than giving a correct answer is. My best advice would be to assume that what pupils appear to know at the end of a lesson provides little indication of what they will know next lesson.
So what is the point of asking questions? If questions make pupils think hard and grapple with difficult concepts then there’s a reasonable chance they’ll result in learning. But if they’re just used to capture evidence of what pupils have just been taught, which we then use to “adapt teaching and learning to meet student needs”, they are, I think, largely pointless. If, however, they’re used to assess what pupils have retained from a previous lesson, then they are very useful for helping pupils remember the learning we determine to be important.
I hope I’m not caricaturing Dylan’s position to say that, as far as I could ascertain, he agreed with me. The nearest we came to a disagreement was on the idea that formative assessment would allow teachers to develop a ‘nose for quality’. (The originator of this phrase, Guy Claxton, was in the audience, but sadly I didn’t get the opportunity to discuss any of this with him.) My contention is that our instincts are terrible. Most people only have a ‘nose’ for what they like. We do what we do because we like it, and this has very little to do with the ‘quality’ of whatever we’re doing. My experience suggests that learning is deeply counter-intuitive and that relying on an innate sense of ‘what’s right’ will lead us astray.
My other concern is with the idea that activating students as instructional resources for one another is likely to result in learning. For me, the only argument for investing time on peer assessment is Nuthall’s observation that regardless of how teachers organise their classrooms, children will find opportunities to talk to each other, and most of the talk will go unnoticed by the teacher. In fact he was able to work out that about 80% of the feedback pupils get on their work is from each other. This might be cause for celebration except for the fact that about 80% of this feedback is wrong – which, if both figures hold, means something like 64% of all the feedback pupils receive is wrong. If that’s true (and I think it might well be) it is probably worth our while to attempt to do something about it.
Here are some other things we agreed on:
- There’s no substitute for knowledgeable, authoritative teacher talk, and denying this to children does not serve them well.
- As a species we are terrible at self-assessment. Dylan gave the example that 85% of drivers in the US rate themselves as having above average driving ability. Similarly, try asking teachers how many believe they are better than average.
- Difficulty is only desirable when it’s not too difficult.
- Relying on ‘what kids like’ is a mistake.
- Learning is most likely to occur when pupils are made to think hard, and this is not necessarily something they will enjoy. If we only ever do what kids like we will, inevitably, be dumbing down. I’d go further than this and suggest that the ideal state in which a pupil can leave a lesson is one of struggle. If they understood everything, that might be because they were asked to do something too easy, or it might produce in them the ‘illusion of knowing’. But if they get used to being confronted with troubling knowledge and difficult concepts they will leave thinking. And if learning happens when we think hard, and we remember what we think about, they are more likely to learn.
The reaction on Twitter was largely positive, although a few commentators seemed disappointed there was so much consensus. Maybe they were hoping we would strip to the waist and wrestle? I have to admit I’m grateful we didn’t; for a man 15 years my senior, Dylan has a formidable physique.
All in all, I enjoyed the experience immensely and have a more complete understanding as a result of the discussion. We had a very civilised chat afterwards and I can say with complete sincerity that he is a thoroughly good egg.
The only other session I managed to attend on the Friday was Michael Gove’s finale in which he told us he “loved teachers” and admitted that policy was a matter of opinion. Love him or hate him, the one thing you can say about Gove is that he seems to sincerely believe in what he’s doing.
More on Day 2 soon…
Here’s an alternative reading of events from @MisterBHayes
Really useful summary for those not able to attend the debate. Making hard learning possible is one of the benefits of some AfL techniques, but it is actually diminished by others.
So AfL really is a catch-all phrase for all formative feedback – and, by the nature of comprehensive lists, bound by the Curate’s egg principle.
To my mind you’ve got to start with a curriculum question that awakens interest and rewards effort…
I have always found Dylan Wiliam to be a great guy. I saw him talk once and I have never known anyone make more sense. I have sent him the odd email over the years and he has always responded generously and patiently. He gives the impression of someone who is genuinely and thoroughly interested in what he does.
I wish I could have been there for your discussion.
I think that the idea of formative assessment has suffered in schools. Checklists are powerful but formative assessment makes a poor checklist and this has perverted its aims.
I share your concerns about the typical AfL-type lesson which goes something like: Ask them questions and find that they don’t know, teach them, ask them questions and find that now they do know. However, I am not quite as sceptical as you are about performance as a proxy for learning.
In my own practice, I tend to use mini-whiteboards, but either at the start of something – to find out what students already know – or at some distance from the teaching, e.g. a few lessons later. With my exam classes, I have developed the habit of giving the students a past exam question about 2–3 weeks after teaching the relevant content. This then tells me something about retention.
In the sciences there is also the question of misconceptions. This is where constructivism is essentially right – students come into class with preconceived ideas and these need to be dealt with to some extent. This doesn’t require you to surface all of them. We know what a lot of these misconceptions are and we can programme our teaching to deal with them. But the success of programmes like CASE also shows that conscious discussion around these ideas – which, yes, leads to difficult thinking – may be able to enhance learning. Turnfordblog wrote a fascinating recent piece on this issue.
For instance, I will always ask students things like ‘what pushes a satellite around the Earth?’ or I draw a diagram of the Earth with a stickman standing at the north pole and one at somewhere approximating Australia. I then place a ball in each stickman’s hand and ask the students to draw the path it would take if the man let go of the ball. Students who can answer that gravity pulls objects towards the centre of the Earth will still draw the Australian ball falling down the page rather than to the centre of the Earth. Does this mean that they know nothing or didn’t understand the original idea? No. It just means that their conceptions are new, fragile and inflexible. Asking such questions prompts them to apply their ideas in new contexts and clarify their thinking.
You say nothing with which I would disagree. I don’t have a problem with performance as a proxy for learning; it’s the ONLY proxy there is. I just don’t see the point of using performance in the lesson you are currently teaching as a guide for what pupils will know next lesson.
I would definitely agree with that.
Which is kinda how the debate went 😉
Someone once said, “When you hear two different eyewitness accounts of the same car accident, you begin to worry about history.” However, my recollection of the conversation with David, and of our responses to questions from the audience, is broadly in line with his. My disappointment with the discussion was that we didn’t have more time to get into the issues in greater detail. As far as I can tell, there is no word-limit on this blog, so here goes…
David is certainly right that it is much easier to tell when something has not been learned than when it has. If I see that students still have the same misconceptions at the end of the lesson that they did at the beginning, I know my teaching has not been successful, at least at that time. That said, some of the Strategies and Errors in Secondary Mathematics (SESM) research found that students failed to show mastery on an immediate post-test, yet did do so two or three weeks later, even though they had received no further teaching on the topic. Some new learning just appears to get lost, while other bits of learning appear to be drawn into and integrated with existing learning over time, and what seems to be crucial in determining which of these two things happens is the number of connections that are made to existing knowledge.
I would of course concede that whether a student can do something in one lesson is not a very good guide to whether they can do the same thing some time later, but if we see whether students can apply what they have learned in a different context, then I think our conclusions about what is happening will be more robust—not perfect, but more robust. Also, it is worth pointing out that, for me, questioning is just one particular way of finding out where students are in their learning. Making statements, and expecting students to respond to them, is often more powerful, for two reasons. One is that you can’t be wrong in responding to a statement, whereas you can be wrong answering a question. The second is that if a student chooses to use an idea without specifically being prompted to do so, it would suggest that the student has begun to integrate the new idea into their thinking, which is also, presumably, assisted by the conversation with the teacher. Getting emotionally engaged with (i.e., excited about) something also improves long-term retention, which is why peer discussion can be powerful, if students are talking about the right things.
Where I think David and I do disagree is about the relationship between performance and learning (in Robert Bjork’s terms). We both agree that performance in a learning task is a poor guide to the learning that is taking place, but it seems to me that David may be saying that performance in a learning task is of absolutely no use in determining learning (as evidenced, say, by performance at some point in the future). This seems to me to be an empirical question. I am arguing that seeing students apply what they have just learned in a different context in a lesson means that they are more likely to be able to show learning in the future (especially if the new context is far removed from the context used for teaching) while, if I have understood what David is saying correctly, he is saying that we still know nothing about the likelihood of students showing learning at some point in the future. Since this is an empirical question, we can do some experiments, but I think the odds are on my side, since David’s position (or at least my characterization of his position) is rather extreme, in that he is claiming no relationship between performance and learning, whereas all I am claiming is that learning and performance are not entirely independent of each other.
The only other point in David’s account on which I feel the need to comment is that I didn’t explain clearly enough Guy Claxton’s idea of a “nose for quality”. David is right that on a whole range of matters, our judgements are untrustworthy, and we are led into error. In particular, we underestimate the role that chance factors play in our successes, which is why asking successful people about the reasons for their success is such a terrible idea (“The Success Equation” by Michael Mauboussin is excellent on this point). However, Guy Claxton’s idea of “a nose for quality” was intended in a different sense, and this was related to our perception of quality in student work. Some people have argued for providing students with criteria for success, and of course, if we can tell students what to do to make their work good, then we should do so. The point that Guy Claxton was making was that on many occasions, we cannot write down definitions of quality, but we can sometimes, as a group of teachers, agree about whether particular examples of student work are good or not. The important point is that if we do agree on what is good and not so good work, we have the basis of a workable assessment system even if we cannot agree on why we think the work is good. And perhaps even more importantly, we can then begin to enculturate our students into the same “community of practice”.
Thanks for the clarification. Disappointingly, I think I tend to agree with your conclusions about performance as stated here: it certainly seems sensible to suggest that if students can apply concepts in different contexts, that will be a pretty good indication that they are likely to retain what they have learned. Of course I don’t think that there’s no relationship between learning and performance – just that the relationship is much less straightforward than most teachers suppose. As far as empirical evidence goes, there’s a wealth of research into the relationship between learning and performance – conveniently, it has been compiled into a literature review by Nicholas Soderstrom and Robert Bjork: http://bjorklab.psych.ucla.edu/pubs/Soderstrom_Bjork_Learning_versus_Performance.pdf
This ‘nose for quality’ business is almost exactly the case that Royce Sadler makes about rubrics.
David, I disagree with your comment about ‘4. Activating students as instructional resources for one another’, as I think (and I have some original research into the matter) that it depends on how the collaborative learning is structured. I have found that there is a statistically significant improvement in the ‘retention of knowledge’ in classes where I use collaborative learning as opposed to normal chalk-and-talk and drill-and-kill. Again, a lot of ‘group work’ is a complete and unconditional waste of time, but properly structured it is a fantastic tool.
Having been observing this debate as it has developed, and having stuck a very small oar in earlier, and having also been at your session at Wellington, I am deeply impressed with the way you, and Dylan, have arrived pretty much at a consensus, particularly since I find myself agreeing with it. One up for robust but restrained, intelligent debate, I think. I am pretty sure that my trainees next year will get a better session on assessment on the back of this. Thank you.
I find ‘learning intentions’ so artificial I am embarrassed when I say the words. This is especially true in maths where I tell children the learning intention for adding fractions, etc. It strikes me as totally pointless.
To sum up:
DW: you should use what the children learnt in the lesson to determine your planning for the next.
DD: but learning is invisible!
DW: I agree.
TEACHER: so what do I do?
DW: fail! All teachers fail on a daily basis!