Evidence and disadvantage: How useful is the EEF Toolkit?

Although everyone’s education is important, the education of disadvantaged students is, arguably, of much greater importance than that of students from more advantaged backgrounds. The more privileged your background, the less it’s likely to matter what happens at school. Conversely, the more socially disadvantaged your background, the greater the impact of what does, or does not, happen at school. Sadly though, access to education is more than likely to be subject to a Matthew effect: those who have the best chance in life are the most likely to get a great education. That being the case, it seems reasonable to suggest that whilst all children deserve to have the decisions taken by teachers and school leaders rooted in the evidence of what’s most likely to lead to increases in academic progress, this is more urgent for the ‘have nots’.

With this in mind I read John Tomsett’s latest blog on how best to support disadvantaged students with great interest. I always learn a lot from reading John’s thoughts, even if I occasionally disagree. John cites the Sutton Trust Report, Improving the impact of teachers on student achievement, which makes this statement:

The effects of high-quality teaching are especially significant for pupils from disadvantaged backgrounds: over a school year, these pupils gain 1.5 years’ worth of learning with very effective teachers, compared with 0.5 years with poorly performing teachers. In other words, for poor pupils the difference between a good teacher and a bad teacher is a whole year’s learning.

Powerful stuff.

He then quotes Sir Kevan Collins, CEO of the Education Endowment Foundation, as saying, “If you’re not using evidence, you must be using prejudice…” I found myself feeling rather envious of this fabulously pithy aphorism, but then reflected that it might be more ‘truthy’ than true. Of course he’s entirely correct to say that prejudice is the opposite of evidence, but then almost everyone justifies their position with some sort of evidence.

There are two main issues here. Firstly, not all evidence is equal. It’s no good just ‘using evidence’; the question should always be: what evidence? As ED Hirsch Jr. has said,

Almost every educational practice that has ever been pursued has been supported with data by somebody. I don’t know a single failed policy, ranging from the naturalistic teaching of reading, to the open classroom, to the teaching of abstract set-theory in third-grade math class that hasn’t been research-based. Experts have advocated almost every conceivable practice short of inflicting permanent bodily harm.

The EEF’s Pupil Premium Toolkit has attempted to summarise the research around various interventions on which school leaders might be tempted to lavish time and resources, and this has probably helped education to become more evidence informed. However, there are some real issues with the picture they present. First, it relies too heavily on aggregating meta-analyses without accounting for the real problems with taking such an approach. I’ve outlined my reservations here. Second, the picture is partial. Some of the most robust, well-replicated findings from cognitive psychology (like the spacing effect and retrieval practice) are entirely absent. For schools looking to increase the chances of their most disadvantaged students there are few more productive avenues to explore. Feedback – the most highly rated intervention in the toolkit – will, we’re told, result in +8 months’ worth of progress per year of instruction. But compared with what? Not giving feedback? I’m sure I’m not alone when I say I have never encountered a teacher who does not give feedback! I’ve explored further troubling issues with the assumption that feedback is always positive here.

Worse, the Toolkit offers succour for those who think Learning Styles might be a good bet, suggesting it’s likely to result in an additional two months’ worth of progress for every year’s worth of instruction. If you delve a little deeper, you’ll find that actually this is based on a median effect size of 0.13 – well below Hattie’s hinge point of 0.4. And further, we see that this effect size is only as high as 0.13 because of the findings of an unpublished piece of PhD research!

This is an embarrassment! Surely we could come up with a more sensible approach to informing teachers about the best way to support disadvantaged children?

The second point about ‘using evidence vs using prejudice’ is that I know for a fact that there are very many people who, given the same source of evidence as me, will arrive at entirely different conclusions. The problem isn’t either using evidence or using prejudice, it’s that everyone interprets evidence according to their prejudices. I’ve outlined the problems with cognitive bias here, but for a more extensive analysis, you might find my book What If Everything You Knew About Education Was Wrong? useful.

Those at the EEF are as prone to prejudice as anyone. Take the example of the decision to fund further research in Philosophy for Children. There are serious misgivings with the way the trials so far have been run and Greg Ashman makes the point that some at the EEF are overly attached to the perceived benefits of ‘meta-cognitive strategies’.

This is an excessively broad category of teaching interventions aimed at increasing thinking about learning. Reading comprehension tricks fall into this group and such strategies seem ripe for expectation effects (e.g. the placebo effect).

Perhaps this has led to an unconscious bias at EEF headquarters. During [Jonathan] Sharples’ presentation he suggested that the results from trials of meta-cognitive strategies were consistently positive with a similar effect size. Yet he seems to have forgotten about the recent EEF trial of cognitive acceleration, a meta-cognitive approach to science lessons. This trial found no effect for cognitive acceleration.

Of course I agree that evidence is vital in our efforts to transform the life chances of disadvantaged children but I’m not sure we can confidently conclude, as John does, that “we have to improve the quality of teaching in our schools: it is the only thing that matters.” Really? The only thing? Doesn’t it matter what’s being taught and how it’s being assessed? What about the quality of a school’s behaviour systems and pastoral care? Obviously the quality of teaching in schools is crucial, but what’s the point in teachers doing a fabulous job of teaching something rubbish? I’ve argued before that what we teach trumps how we teach. Charitably, we should probably conclude that by ‘quality of teaching’ John means to include these other things, but there will inevitably be some readers who conclude otherwise.

Maybe this suggests we need more or better evidence about what might constitute the best curriculum provision for disadvantaged children. But what I do know is that currently the EEF is as much part of the problem as it is part of the solution. It’s worth considering whether decisions about the education of the most disadvantaged are too important to leave to the prejudices of an ideologically driven, unaccountable clearinghouse which decides both what research to fund and what to make available as part of a Toolkit.

John includes a quotation from the economist Thomas Sowell’s book, The Vision of the Anointed: Self-Congratulation as a Basis for Social Policy: “It is so easy to be wrong – and to persist in being wrong – when the costs of being wrong are paid by others.” Quite so.

February 26th, 2017 | research


  1. Tom Burkard February 26, 2017 at 6:07 pm - Reply

    Great post–but I take issue with you on one point: I think our most-able pupils are being failed just as much as the disadvantaged and less-able. In terms of winning life’s race, of course they fare better. But a knowledge-lite curriculum affects everyone. And like it or not, our society depends more upon its most-able members.

    • David Didau February 26, 2017 at 7:38 pm - Reply

      Are you conflating ‘most able’ with ‘most advantaged’?

  2. Grumpywearymathsteacher February 26, 2017 at 6:14 pm - Reply

    From what I’ve seen of philosophy for children, it looks as if it could increase general knowledge, vocabulary, cultural capital, confidence. Can’t be that bad, surely?
    (I know you were complaining about the research, not the thing, but…)
    As for measuring benefits using SAT scores, I have little confidence in the current Y6 SAT tests, having seen the scores of my Y7 Maths class, and then met them. Depending on how vigorously their different primaries crammed them, they may have a great score and poor understanding of the basics, or vice versa.
    When you are judging an essay using Comparative Judgement, you are kind of measuring it using your gut feeling for its quality? (I don’t mean that in a bad way, you just _know_ because of your extensive knowledge and experience.) I think maybe when Sue Cowley and others say they _know_ that certain kinds of activity are beneficial, perhaps they know this in the same way, from long and deep experience? (And I have read what you’ve written about the fallibility of humans’ judgements of, well, everything, but you have to make qualitative judgements sometimes, surely?)

    • David Didau February 26, 2017 at 7:47 pm - Reply

      I struggle to accept that P4C *could* have the results claimed for it. The mechanisms are broadly similar to those of Let’s Think… which I critiqued here https://www.learningspy.co.uk/research/thinking-skills-can-teach/

      Your point about SATs – or any tests – being a fair measurement is trickier. You have to have *some* way to measure impact of *something*. What would you prefer? I wrote about this here https://www.learningspy.co.uk/research/can-learning-summed-test-scores/

      CJ works because it aggregates tacit knowledge. It’s by no means perfect, just a lot better than a teacher (or anyone) trying to apply a set of standards to children’s writing.

      From what I’ve seen, little of what Sue Cowley says or recommends is based on anything beyond what she would like to be true. This is not a sound basis for reliable judgment.

      • suecowley February 27, 2017 at 11:04 am - Reply

        Since you mention me by name again David, are you able to be specific about what I have said or recommended that you feel is unreliable? For instance, my comments on the value of play in EYFS are based on my twenty five years in education and on my experience of helping our early years setting move from a requires improvement Ofsted grade to an outstanding one.

        • David Didau February 27, 2017 at 11:46 am - Reply

          Hi Sue – lovely to see you popping up in my blog comments again. It’s been too long 🙂 As you can see from the thread, I responded to a comment in which you were mentioned by name and so felt I had to refer to you explicitly. I figured you might prefer this to being ignored.

          I honestly haven’t read much of what you write beyond some of the mischievous “just asking questions” tweets which you seem to enjoy so much. I’m therefore unable to comment in any detail on specifics, but on the whole I have little confidence in the value of intuition as a basis for determining policy. And Ofsted grades are a terribly poor proxy for determining if your prejudices are worthwhile. All this tells us is that you share the same prejudices.

          That said, you’ll be pleased to see from this blog https://www.learningspy.co.uk/psychology/can-learn-evolutionary-psychology/ that I think your intuitions on play are not entirely off the mark.

          Thanks for reading 🙂

        • Michael Pye February 27, 2017 at 2:03 pm - Reply

          Your first question is reasonable, Sue (you’re obviously annoyed, but this is reasonable considering David’s comment), but the second part, where you argue that your experience constitutes the evidence you need, seems to have missed the ongoing point he makes.

          It is quite possible, even likely, that our experience compensates for flaws in our teaching philosophies or that our prejudices and cognitive biases limit our ability to assess the effectiveness of our own practice.

      • Grumpywearymathsteacher February 27, 2017 at 9:20 pm - Reply

        The old Y9 Maths SATs were thoughtfully designed to assess understanding. I don’t think you shouldn’t use tests, just don’t think the current Y6 ones are any good.
        Sorry if I’m being slow here, but what’s the difference between your gut feeling (‘tacit knowledge’) on a child’s writing and that of an Early Years expert on a younger child’s language level, for example? I know you have more than one person at a time for CJ but they tend to agree a lot, don’t they?
        I don’t think you can just dismiss what Sue C says because she doesn’t sound all scientific. From looking at her blog, it seems that in her preschool they are providing the kind of experiences that the most advantaged kids get at home from their middle class intellectual parents – and we know those kids have an advantage when they start school…I think perhaps if all children had this kind of start, we might not have to do so much direct explicit instruction in secondary school (yes, you can shoot me now!) because they would be further along the road from ‘novice’ to ‘expert’ ?

        • David Didau February 27, 2017 at 9:36 pm - Reply

          The difference between one person’s intuition and an aggregate of five times the number of judgements per script is *much* higher reliability.

          It’s not personal, but I’m dismissive of anyone who routinely ignores or derides the need for evidence and research. Your point about not needing explicit instruction if Sue gets her way in the early years is based on a common misunderstanding of how human beings have evolved to learn. This post explains why: https://www.learningspy.co.uk/psychology/can-learn-evolutionary-psychology/

          • Grumpywearymathsteacher February 27, 2017 at 10:00 pm

            I didn’t mean no explicit instruction at all, but I do think that if everyone were as lucky as those of us who were immersed in language during our early years (being spoken to, read to, taught the names for things, given lots of stimulation and varied experiences…) we would have hardly any students in secondary who struggled to speak and write in sentences.

          • David Didau February 28, 2017 at 1:50 pm

            It’s a fallacy to believe that a rich spoken language environment automatically confers the ability to write in academic English, but I agree that a language rich start is highly beneficial. However, the need for explicit instruction to teach biologically secondary knowledge would remain the same.

    • teachwell February 27, 2017 at 10:05 am - Reply

      Have you taught P4C? Far from supporting cultural capital – the main focus is for children to repeat their own ill-thought-out ideas over what are utterly vacuous topic areas they can “relate” to. I would never use it again and would just teach actual philosophy.

  3. Warren February 26, 2017 at 11:39 pm - Reply

    Is it not the case that if the educator is passionate about the topic/subject/approach then we can expect to see a positive result in at least some of the learners they teach, whether or not they reach Hattie’s 0.4? With the same subject and a different educator, learners can be served rubbish. If the subject matter is questionable (rubbish), are there benefits in learning differently, to develop our ability to learn differently? Not everything we learn is needed in our lives or assessable.

    • David Didau February 26, 2017 at 11:47 pm - Reply

      If it’s true that some things we need to learn are not assessable, then how would we know if we’d learned them?

      The only reasons I can think of for why something might not show up on an assessment designed to look for it are either a) it has been performed or b) it doesn’t exist.

      And what do you mean by developing “our ability to learn differently”? We learn by processing items in working memory and storing them in long-term memory. That’s all there is; there’s no other way for us to learn.

  4. Derek Hopper February 27, 2017 at 5:06 am - Reply

    Good on you for holding the Education Endowment Foundation to account – so important to dig a little deeper when reading claims about research and not take it all at face value.

  5. tonyparkin February 27, 2017 at 9:17 am - Reply

    “If it’s true that some things we need to learn are not assessable, then how would we know if we’d learned them?”.
    I was at an interesting session on assessment and assessment practice at the RSA last Friday, offered by Prof Geoff Masters, from ACER, Australia. He recounted this as a challenge set to him by an Australian Catholic school who valued ‘care, conscience and compassion’. They wanted help with assessment – but recognised that they had no methodology or quantifiable criteria for assessment.
    So the key question is, should that school give up on its mission statement to instill ‘care, conscience and compassion’ merely because they lacked a suitable methodology or way of assessing their effectiveness at doing so? Should we only value that which we can measure?

    • David Didau February 27, 2017 at 10:15 am - Reply

      “Should we only value that which we can measure?” No, of course not. I value all sorts of things I have no ability to measure. But that’s an unhelpful way to tackle your key question.

      And anyway, I said ‘assessable’ not ‘measurable’. Not all assessments have to have a metric. We can, for some things, simply say whether or not they have been demonstrated. For others, we can look at two examples and say which we think best demonstrates the stuff we’re trying to assess.

      The trouble is, if the thing you value doesn’t show up on an assessment, maybe it doesn’t exist. This is true, I believe, of such domain general ‘skills’ as ‘creativity’ and ‘critical thinking’. What shows up is something domain-specific which we then falsely claim to be domain general without seeing whether what we’ve found generalises.

      If, like the school you mention, we value ‘care, conscience and compassion’ then it stands to reason that we should explicitly model what these things look like in the hope of impressing students with the worth of our values. If they are modelled well, maybe children will share our values. This is fine.

      What would, in my view, be a mistake is to teach a ‘care, conscience and compassion’ curriculum. If you can’t measure the impact of this teaching then how can you know whether or not it’s having any effect? Indeed, how can you know if it’s having a negative effect? For all you know, you might inadvertently be making children unkinder and less caring.

      • tonyparkin February 27, 2017 at 11:21 am - Reply

        Ah… but how do you know if modelling ‘care, conscience and compassion’ is having a positive or negative effect? Yet you have faith that this is fine, but not in teaching these?
        So presumably modelling creativity and critical thinking in school is fine, as long as they are not included in formal assessment in the curriculum? Possibly you believe that it is impossible to identify and assess creativity, or critical thinking? Or are you merely saying that any assessments are so subjective as to be unreliable as evidence?
        Or are you just objecting to any claim that these are generalisable skills, and are happy to see them explicitly taught in a specific context? Presumably also refuting the teaching of chess, and philosophy(?), as ways of enhancing critical thinking and creativity?
        Messy, isn’t it….? 🙂

        • David Didau February 27, 2017 at 11:54 am - Reply

          You don’t know. I have very little faith in anything, I’m afraid. I’d simply prefer modelling over instruction as the opportunity cost is much less.

          But, no – it really isn’t all that messy:

          1. Modelling creativity & critical thinking *within a domain* is a good idea – crucial in fact if you want students to get good at them. They should definitely be included in formal instruction. *Domain general* skills don’t exist so can’t be modelled.
          2. It’s relatively straightforward to identify and assess creativity & critical thinking *within a domain*
          3. All assessment is subjective to a degree but there are ways we can aggregate such subjectivity in order to make more reliable inferences.
          4. I’m not just happy to see domain specific skills taught, I think it’s essential to do so.
          5. Yes, teaching chess just makes you better at chess. This is one of the most well researched domains and there is a very clear consensus about that. I also refute teaching philosophy as a way to improve domain general skills *because such skills do not exist.* The recent EEF trial on P4C is, I think, laughable.

  6. Fred Flint February 27, 2017 at 12:16 pm - Reply

    Thanks for this email. It chimes with a lot I have been thinking about and it’s good to see these mega-synthesis arguments being brought to task.

    I came across a paper by two of the big people in meta-analysis, Cheung and Slavin (https://tinyurl.com/h7wr6zc) which notes the relationship between methods used in research and the ‘effect sizes’ which you get. It suggests that you really can’t compare between different studies with different methods. Simpson (https://tinyurl.com/zhzbnwx) goes further and takes the whole enterprise of ‘effect size’ to task. In particular, Simpson says the same things as you about feedback: he uses it as an example of how you can’t compare studies: a lot of the feedback studies find the effect size of feedback at the level of ‘correct/incorrect’ against a control group with no feedback, while some others are comparing formative feedback with summative feedback etc. The same for homework: some are homework vs no homework and some are homework vs alternative homework.

    Oh, and by the way, the EEF also make some pretty big mistakes: I looked up the Slemmer PhD you mention in your piece on learning styles – the PhD student gives the effect size as a correlation measure and the EEF take it to be a Cohen’s d: they are not the same thing at all, so their estimate is WAY out!

    Cheung and Slavin argue that we need to be much more careful about adjusting and comparing, but I think I’d go with Simpson’s point about not driving policy on the basis of ‘effect size’ at all – it just isn’t evidence of more or less effective educational interventions, it is evidence of more or less well conducted studies.
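
To see why treating a correlation as a Cohen's d matters, here is a minimal sketch of the standard textbook conversion between the two, d = 2r / sqrt(1 - r^2), which assumes two groups of equal size. The function name `r_to_d` and the example values of r are purely illustrative assumptions; they are not figures from the Slemmer thesis or from the EEF Toolkit.

```python
import math


def r_to_d(r: float) -> float:
    """Convert a correlation coefficient r to Cohen's d.

    Uses the standard conversion d = 2r / sqrt(1 - r^2), which assumes
    two groups of equal size. Hypothetical helper for illustration only.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2.0 * r / math.sqrt(1.0 - r * r)


# Illustrative values only; none of these come from the studies discussed.
for r in (0.1, 0.3, 0.5):
    print(f"r = {r:.2f}  ->  d = {r_to_d(r):.2f}")
```

Under that assumption the converted d comes out at roughly twice r for small correlations, and more than that as r grows, so reading one statistic as the other can misstate an effect considerably.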

  7. Michael Rosen February 27, 2017 at 12:25 pm - Reply

    General observations: human beings spend quite a lot of time doing useless things. There really isn’t much point going to a football match, watching the Oscars, whistling, putting pictures on the wall of your home, taking photos of a tree. Almost any subject (or topics within a subject) we study at school can be given the once-over and shown to be intrinsically useless. We can start from extrinsic factors like: ‘we need bridges’ or ‘making popular music is commercially valid’ or ‘we need to know languages to communicate with people who don’t speak our own’ etc but nominally education has objectives other than these purely extrinsic ones. Yes, these are blurry, grey, woolly and easy to deride.

    However, in the territory of what we value about the quality of life, and how we relate to each other, many of these blurry, grey and woolly matters come up. And there is a huge range of them: e.g. The government says ‘Faith schools good’. At faith schools, children are taught all sorts of things about an ineffable being. No amount of evaluation and assessment (showing that the teaching is not turning out believers, say) would discourage people who run faith schools from doing this. No government in the immediate present is going to alter that. In other words, the argument about what actually takes place in the schooling of thousands of children (the ones in faith schools) is going to be immune to arguments about whether teaching religion should be on the curriculum or not. (They might be interested in relative effectiveness of ‘lessons’, but that’s not really what’s going to affect the presence of the subject itself. )

    In my own field of the arts and literature, we are mostly governed by history, custom, practice and experience. And we know that it’s quite easy and possible to eliminate all arts practice from schools. We would make a lot of noise, but justifying it in terms of assessment criteria would hardly hold water with people who think that there just isn’t time for this stuff in a knowledge-based curriculum. I notice that some people justify it in terms of profitability – which may hold water with utilitarians, I guess.

    • David Didau February 27, 2017 at 2:18 pm - Reply

      We’ve been over this ground before and, broadly, I agree with you. That said, the fact that “human beings spend quite a lot of time doing useless things” shouldn’t be used as carte blanche for public money being spent on “useless things”. The ‘blurry, grey’ purposes of education which many hold dear are – I think – a big part of the problem. My stated purpose remains: to make children cleverer. This is meaningful, measurable and achievable. The fact that some people’s favourite purpose is none of these things ought to be cause for alarm.

      I agree that faith schools are a nonsense. I think we’re always better without superstition and magic. But that’s no reason not to teach a history of religion. I’ve just finished Richard Holloway’s A Little History of Religion, and it’s excellent. All children would benefit from knowing this stuff.

      As for the arts, as I’ve said before, they have a vital place in a knowledge-rich curriculum. My justification is that we think better if we have knowledge of the various arts. We may however disagree about how the arts ought to be taught. But let’s leave that for another time.


      • Michael Rosen February 28, 2017 at 8:59 am - Reply

        Re History of religion – absolutely – great subject. Various governments (or hand-picked committees reporting to government) have recommended that non-religious world beliefs e.g. humanism, should be taught as part of this subject, but in many schools this is ignored.

        How do we know that doing useless things doesn’t also make us cleverer? Old people are encouraged to do active useless things to stop their brains dying. (I paraphrase/exaggerate but you get my drift.) I know you don’t like personal anecdote but I have observed the way some of my children have poured a huge amount of energy into something ‘useless’ and become very ‘factoid’ and rote-learner-ish about it. It was as if they rehearsed knowledge-acquisition through doing something useless like collecting football cards and testing each other on this ‘knowledge’. My brother the scientist did the same with the Le Mans 24-hour car race. Useless but he made it useful.

        • David Didau February 28, 2017 at 1:48 pm - Reply

          I think it’s very clear that so-called ‘useless’ things are actually very useful in constructing knowledge schemas. I side with AE Housman: All knowledge is precious whether or not it serves the slightest human use.

          • NEm February 28, 2017 at 5:50 pm

            Everybody’s world-view is based, in a large part, on faith; for many people it might not be religious faith but it is faith nonetheless. This is simply because it is very difficult to know everything there possibly is to know about the true state of the world and we have to fill in the large gaps based on our grasp of the facts as we understand them.

            It’s very easy to be critical of faith schools and even mock what they teach about an ‘ineffable being’. But the problem isn’t with faith schools per se, or even religious faith necessarily; it’s with the ‘truths’ that so many people hold without having examined them deeply at all. How about supposedly British values such as ‘democracy’ or the near-universally accepted idea that we have a free press that presents a wide range of opinions? These ideas can be thoroughly debunked with relatively little effort. Incidentally, part of religious history would cover the movement away from magic and superstition to a concern about justice – it would probably include the debt humanism has to the values celebrated in the religious traditions it grew in. All knowledge is indeed precious and in the case of religious knowledge, as in the case of democracy etc., it needs to be examined critically, without flinching, and respectfully without bending to people’s sensibilities.

            A rambling post…I’m sorry.

            I suppose I’m making the point David often makes. We shouldn’t just accept the received wisdom or what is most convenient for us to believe. That goes for religious, secular, progressive or traditional views on the world, or teaching, for that matter.

          • Michael pye February 28, 2017 at 8:28 pm

            Words that are the same don’t always have the same meaning in different contexts. Faith in a god is not the same as faith in a person or general idea. (At least not in most religions but I’ll come back to that).

            I myself am not religious, having committed myself to atheism. But I was raised Catholic in my early life and the faith I was raised to believe in was distinctly different in its scope. I think in this case we should use two different words to distinguish between them.

            Many people with a more philosophical and non-denominational view might not see this distinction.

            Please forgive my focus on this point. I agree with where you are going with your argument.

          • Michael Rosen March 1, 2017 at 8:52 am

            Like it.

  8. […] few days ago I wrote about why we shouldn’t credulously accept evidence, and that it wasn’t as simple as suggesting that teachers either use evidence or prejudice to […]
