Campbell’s Law: The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to measure.
Goodhart’s Law: Any measure used for control is unreliable.
Metrics can be great. We can become so preoccupied with what’s right in front of us that it’s all too easy to miss anything peripheral. Keeping one eye on data helps us to think statistically instead of relying only on anecdote and heuristics.
It’s hard to argue that metrics haven’t led to major improvements in education since I started teaching in the late 90s. Back in the day, nobody asked me to look at any kind of data and I had literally no idea of my students’ prior attainment and only the vaguest notion of how well they did in national exams. Consequently, lots of children did very badly and no one took any responsibility. Times changed. Education became increasingly focussed on gathering performance data and pressure was put on schools to make sure exam results went up. So they did. Then came concerns about grade inflation, gaming and malpractice.
Did the metrics help? Well, maybe. Perhaps they helped sort out some of the most egregious incompetence in the system, but then, as Goodhart’s Law kicked in, they became increasingly unreliable.
If you’re in any way concerned with measurement, accountability or assessment in education you really ought to read Jerry Muller’s wonderful book, The Tyranny of Metrics. In it Muller argues that while metrics can be useful, more often than not they do far more harm than good. Chapter 8 provides an apposite case study of the perverse incentives and unintended consequences of what Muller calls ‘metric fixation’ in schools.
Basically, metrics can be useful in a low-stakes, high-trust environment. Where stakes are high or trust is low, attempts to measure student outcomes result in, at best, gaming and, at worst, cheating. Similarly, attempts to measure teacher effectiveness usually result in turning biases into numbers which, in turn, lead to anything from resentful compliance to unfair dismissal.
Many of these problems should be well known to regular readers of this blog. Some posts you might find interesting include:
- 5 questions to guard against availability bias and made up data
- Perverse incentives and how to counter them
- Big data is bad data
- The illusion of knowing
- When Assessment Fails
- Intelligent Accountability
Muller ends his book with a very useful checklist which I’ll summarise here:
1. What kind of information are you thinking of measuring?
When we try to measure anything that can be influenced by the process of measurement, reliability suffers. Teachers and students are self-conscious agents and, if rewards or sanctions are tied to the measurement process, behaviour will be distorted. That said, the more teachers and students agree with and approve of what is being measured, the more likely they are to behave in ways that increase the validity of the measurement.
Tip: if you want to measure something inanimate, go for it. If you want to measure some aspect of human performance, gain consensus first and avoid tying your metric to rewards or punishments.
2. How useful is this information?
Just because you can measure something doesn’t mean you should. Ease of measurement tends to be inversely proportional to usefulness. Ask yourself, why do you want this information? What are you going to do with it?
Tip: Think carefully about how collecting data will improve the experience of students and teachers. If you’re not sure it will, don’t measure it.
3. How useful are more metrics?
Measures of human performance are most useful at revealing outliers. A good metric might reveal misconduct or ineptitude but may still do a very poor job when it comes to providing meaningful information about good or average performance. Most schools are small enough that school leaders already know where there is poor performance. Introducing a metric to measure what you already suspect wastes the time of the majority. Measure to the least degree that you need to.
Tip: Treating all teachers or students identically is fundamentally unfair. If you know someone’s doing a good job, don’t interfere. If you know someone’s struggling, invest your efforts in helping them improve.
4. What are the costs of not relying on standardised measurement?
Every headteacher I’ve ever spoken to has a pretty good idea of what’s going on in their school. If a teacher always seems to be missing from duty, if parents continually request a particular teacher does or does not teach their child, these are useful sources of data. Does it matter if these intangibles fail to show up on test scores? Maybe. If a teacher is well loved but gets poor results you might want to intervene, but you should probably intervene differently than you would for a teacher who gets poor results and who is roundly despised. Everyone loves a graph, but what is it actually telling you?
Tip: Don’t just look at metrics – compare with more informal sources of knowledge.
5. To what purposes will the measurement be put? To whom will the measurement be made available?
Transparency can have a dark side. If we know that data collected on us will be made publicly available then what could be potentially useful is much more likely to distort behaviour. Arguably, it’s the high-stakes nature of Ofsted inspections that causes most of the perverse incentives and unintended consequences. If their inspections were not made public maybe school leaders would have an easier job of improving schools? Similarly, while schools should be held to account for student outcomes, does it help to make these outcomes public? What if individual teachers’ results were published in the same way? Metrics that might really help improve performance if only used internally can backfire badly when they become too transparent.
Tip: Think carefully about what you intend to do with the information you gather. If your purpose is to improve a school then it’s probably not a good idea to disseminate information beyond those who absolutely need to know.
6. What are the costs of acquiring the metrics?
Information is never free. Time spent gathering data is time that cannot be spent on improving performance. Data collection leads to processing, analysis and presentation. Before long the opportunity costs far outweigh the gains and quickly become a distraction.
Tip: Think about what else teachers could do with the time they currently spend collecting, processing and analysing data.
7. Why are performance metrics being demanded?
The demand for metrics often stems from ignorance. If you’re new to a school you may think it’s an important exercise to collect lots of data to find out about it. New heads often spend lots of time observing all their teachers. All this comes with costs paid for by other people. Time spent finding out about a school is time that cannot be spent on improving it. You may think it’s important to collect all this data, but someone else will almost certainly already know what you don’t.
Tip: Wherever possible, promote from within rather than hiring externally.
8. How and by whom are the measures of performance developed?
There’s good evidence that metrics are far more likely to be effective if those being measured have had significant input into developing the metrics they will be subject to. Anything imposed from above will be resisted, gamed, and complied with in the dullest possible manner. When teachers are told to mark books in a particular way or to start lessons with a particular set of routines, they may well have a better way which they are now prevented from implementing. You will have sacrificed quality for the hollowest form of consistency. Instead, involve as many different people as possible in designing the processes you’ll use for assuring quality.
Tip: Think carefully about how those you’re responsible for managing might react when measurements are imposed. Ask teachers what they think they should be doing and hold them to account for what they’ve said they’ll do.
9. Even the best measures are subject to corruption or goal diversion.
Where there are rewards or punishments there will be drawbacks. All too rarely do those in authority think through the possible unintended consequences of imposing metrics. If you expect teachers to welcome students to each lesson at the classroom door, will this make them more likely to ignore bullying or other social problems because they just don’t have time to deal with them? If you ask teachers to record behaviour infractions on a computer system, will they end up ignoring bad behaviour because they haven’t got the time to log every incident? These unintended consequences don’t mean we should never measure or check anything, but they do mean we should think carefully about what we might be incentivising people to do.
Tip: Try to anticipate problems and think, what would I do if I were a busy, main scale teacher on a six-period day and someone asked me to comply with this system?
10. What are the limits of what is possible?
Not every problem has a solution, and fewer still can be solved with metrics. Not everything can be improved through measurement, and not everything that can be measured can be improved. Often, by gathering data we end up making problems seem more pressing without actually coming any closer to a solution. Metrics have their uses, but they are but a single arrow in your leadership quiver. Human conversations, patience, humility and warmth may go a long way to smoothing over what databases and spreadsheets only exacerbate.
Tip: Be clear on what you can and can’t solve. You can almost always find ways to make the lives of those you lead more pleasant.