The cost of bad data is the illusion of knowledge – Stephen Hawking

What’s more likely to kill you? A shark or a hot water tap? We’ve all heard stories of killer sharks, but as yet Spielberg hasn’t made a thriller about killer plumbing. We reason based on the information most readily available to us. We assume that the risk of dying in a plane crash is greater than the risk of dying on our sofa because plane crashes are so much more dramatic. But we’re wrong.

This is the availability bias. We make decisions based on the most readily available information, in the belief that because it’s readily available it’s more likely to be accurate. Sometimes the information we can call to mind is accurate, but sometimes it’s not. Ignorance isn’t bliss; it’s scary. We are often most terrified by the unknown. But if something feels familiar, no matter how bad it is, we can cope. We prefer erroneous information to no information at all.

So in order to feel like we know what we’re doing, we surround ourselves with data. But although you can’t have information without data, you can most definitely have data without information. Data is uniquely comforting because it’s just so quantifiable. If you can turn something into a percentage or a bar graph it must be objectively true. The problem is, it’s remarkably easy to make up data. This leads us to do all kinds of foolish things in schools.

Consider this entirely fictitious scenario: a school leadership group is considering moving away from lesson grading in the light of a landslide of disconfirming evidence. They accept that lesson grading is invalid and unreliable, and that taking a lesson study approach is more likely to support the professional development of teachers. But, and it’s a big but, what about Ofsted? Those pesky inspectors are expecting to see a neat spreadsheet which shows the percentage breakdown of teaching which is outstanding, good and requires improvement. How will they react if this data is not to hand? How can we be accountable without numbers? And that’s the problem.

Consider the findings of the MET Project, which demonstrate that when the same lesson is watched by two observers, they will often give very different grades. If one observer gave a top grade, there was a likelihood of about 70% that the second observer would give a different grade. And if one observer gave a bottom grade, the likelihood that the second observer would disagree rose to almost 90%! Learning is invisible. Any judgement made in the classroom about how pupils are learning is guesswork at best. Any attempt to turn this information into data is witchcraft.
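To get a sense of what those disagreement rates imply, here’s a minimal simulation (the observation model and noise level are entirely invented; the point is the shape of the result, not the exact figures). When a single observation is this noisy, any one grade tells you very little about the lesson:

```python
import random

random.seed(0)

GRADES = ["outstanding", "good", "requires improvement", "inadequate"]

def observe(true_quality):
    """One observer's grade: the lesson's underlying quality plus heavy
    observer noise, snapped to a four-point scale. The noise level is an
    invented figure, chosen only to produce MET-like disagreement."""
    noisy = true_quality + random.gauss(0, 1.2)
    return GRADES[min(3, max(0, round(noisy)))]

# 10,000 lessons, each graded independently by two observers
pairs = [(observe(q), observe(q))
         for q in (random.uniform(0, 3) for _ in range(10_000))]

# How often does the second observer disagree, given the first's grade?
for grade in GRADES:
    seen = [(a, b) for a, b in pairs if a == grade]
    if seen:
        differ = sum(a != b for a, b in seen) / len(seen)
        print(f"First observer said '{grade}': second observer "
              f"disagreed {differ:.0%} of the time")
```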

But it’s soooo comforting. Many school leaders have been seduced by the easy certainties of grading lesson observations, aggregating the grades, and then proudly declaring that teaching in their school is 80% good or better. But this is meaningless. Assigning numerical values to our preferences and biases gives them the power of data, but they’re still just made up.

Hunt’s illusion of levels

This is one of the many reasons why I’m delighted to see the back of National Curriculum levels. At Policy Exchange’s recent conference, What should the political parties promise on education in 2015?, Tristram Hunt inexplicably referred to this decision as a “spectacular own goal”. When challenged on this he said something along the lines of, ‘Well, when I’ve spoken to teachers that’s what they’ve told me.’

But NC levels are made-up data at their worst. Someone (often a teacher) is asked to assign a numerical value to students’ work on a regular basis. We then pore over these tea leaves as if they were an objective reality rather than someone’s best guess about how a student performed on a particular task on a particular day. And then we write reports saying with absolute certainty that ‘Emily is a 4b in writing’ and ‘Isaac is a 5c in maths’. But what does this actually mean? What can we know about what Emily or Isaac can actually do? These numbers provide the illusion of knowledge. And on this illusory foundation we build a house of cards with which to hold schools and teachers to account.

Always remember, target grades are made up.

We fall into the same bear pits when setting students’ targets. We tell them they need to know their target grades as if these were cast-iron certainties. But while target grades may not be simply plucked from the air like lesson observation grades, they’re based on statistical probabilities that may have some validity when applied to large cohorts but are reduced to meaningless nonsense when applied to individuals. I’ve blogged before about how pernicious this practice is. Possibly the most useful thing we can do is to subvert these targets to harness the power of the growth mindset.
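To see why a cohort-level probability collapses when applied to an individual, here’s a toy model (every number in it is invented; it doesn’t reflect any real value-added calculation). The ‘target’ describes the cohort’s average perfectly well, yet most individual pupils land nowhere near it:

```python
import random
import statistics

random.seed(0)

def outcome(prior_score):
    """A hypothetical model: outcomes track prior attainment on average,
    but with the wide individual spread real cohorts show. Every figure
    here is invented for illustration."""
    return 1.2 * prior_score + random.gauss(0, 1.5)

prior = 5.0                                   # one prior-attainment band
cohort = [outcome(prior) for _ in range(1_000)]

target = statistics.mean(cohort)              # the cohort-level 'target grade'
near = sum(abs(x - target) <= 0.5 for x in cohort) / len(cohort)

print(f"Cohort average (the 'target'): {target:.1f}")
print(f"Pupils within half a grade of that target: {near:.0%}")
```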

But data isn’t bad in and of itself. Just as guns don’t kill people, data doesn’t distort the curriculum or warp decisions about what to teach: we do that. We are comforted by the illusion of knowing, but we really don’t know. And any accountability system that allows people either to input numbers they’ve made up, or to extrapolate data that has been wrestled into meaning something it was never intended to mean, is doomed to fail. What’s worse is that many schools, teachers, parents and children aren’t even aware of the failure.

So what can we do?

Next time someone shows you a spreadsheet, try asking the following questions:

  1. If this data is the solution, what’s the problem?
  2. Is there a different way of interpreting the data?
  3. How can I verify the quality of the data I’m being shown, and what is its margin of error?
  4. What are the limitations of this data – what doesn’t it show?
  5. How is this data likely to affect my decision making? What would I do differently if I didn’t have this data?

In this way maybe, just maybe, we can avoid some of the potential pitfalls associated with availability bias.

Further reading

Jack Marwood’s Using Data Properly: Ditch the Cargo Cult Data for Actual Data