Earlier this week, Labour leader Jeremy Corbyn turned up at the NEU annual conference with some crowd-pleasing ideas. The most eye-catching of these was that he would, if elected, scrap SATs, saying, “We need to prepare children for life, not just exams”. Cue rapturous applause from the assembled trade unionists. None of this is particularly surprising, but what does intrigue me is why Corbyn and the NEU want to get rid of SATs. For Corbyn’s part, he says, “SATs and the regime of extreme pressure testing are giving young children nightmares and leaving them in floods of tears.” Of course, this is horrible. No one in their right mind should accept “extreme pressure testing” which gives kids “nightmares” and leaves them as sobbing heaps. But the most pressing question here is why SATs are perceived as high-pressure tests by 10-year-olds.
The tests given at the end of Key Stage 2 are designed as an accountability measure to determine whether schools have done a decent job of teaching the young people in their charge. Unlike GCSEs and A levels, these tests are not certificated; how well children do on these tests is – should be – irrelevant to the rest of their education. To be clear, there is absolutely no reason why any ten-year-old should feel stressed at being asked to sit a maths and reading test. So why do they? Well, Corbyn went on to tell union delegates that he has met “teachers of all ages and backgrounds who are totally overworked and overstressed. These are dedicated public servants. It’s just wrong.” OK. It makes sense that teachers might find SATs stressful; after all, the tests are designed as an accountability measure to ensure primary schools are using public money well. If children do poorly then maybe it’s not unreasonable that government should want to do something about it. Ultimately, school leaders’ and teachers’ jobs are on the line. And if teachers are stressed then perhaps we can forgive (or at least understand) them passing on some of their stress to their students.
Maybe this is sufficient reason to get rid of SATs? The next question is, what – if anything – do we replace them with? Not even Corbyn believes we should do away with assessment altogether; instead, he plans to introduce a “more flexible and practical” primary assessment system. That’s all the detail we have for now. My concern is that this new system will swap standardised tests for teacher assessment. The bad news is that any expectation on teachers to assess students’ work adds to their workload. If we’re going to ask teachers to work harder we ought to be pretty sure that the additional work we’re asking them to do is worthwhile. So, is it? Well, contrary to many people’s intuitive beliefs, teacher assessment is both less reliable and more unfair than standardised testing.
Teachers, like everyone else, are subject to predictable and unconscious cognitive biases. We have decades of research showing that heuristics and biases like the halo effect, confirmation bias, the anchoring effect, overconfidence bias, and many others cause us to systematically overestimate our ability to assess students’ work fairly and reliably. Not only that, in studies where teachers were told that a student had a learning disability, they rated that student’s performance as weaker than did other teachers who were told nothing at all about the student before the assessment began. There’s also evidence to suggest teachers are unconsciously biased against children from ethnic minorities (Burgess and Greaves, 2009). And here’s another study which investigates the bias against children due to race, gender and ability. Conversely, as Suskind & Rasmussen show, we also routinely assume “well-behaved students are also bright, diligent, and engaged.”
Here’s a video of Rob Coe explaining the many problems with teacher assessment:
Of course, we all believe we’re immune from these biases which affect everyone else. There’s a name for that too: the bias blind spot. In one study, only one out of 661 survey respondents admitted to being more biased than the average person! Claiming, “It works for me!” is just further evidence of bias. All this suggests that adding to teachers’ workload by making them assess students’ work is contrary to the aims of social justice.
This is not to say that I’m happy with the SATs. Whilst I don’t have a strong opinion on the maths test except to say that I’d prefer it if the primary maths curriculum restricted itself to children mastering number, the test that I’d really like to see overhauled is the reading paper. This test is possibly the most iniquitous and unfair test children have to sit. It is essentially a test of general knowledge and the more socially advantaged a child is, the more likely they are to recognise the references and vocabulary in the text they are expected to write about. All this really tells us is how leafy a school’s postcode is. Instead, we might do better to assess reading fluency as, according to DfE estimates, 20% of children leave primary school unable to decode quickly enough to access an academic curriculum at secondary school. But if we do decide to get rid of SATs, are our only options better standardised tests or flawed and unreliable teacher assessment? Might there not be a third way?
Tim Oates suggests intelligent sampling may be the way forward. There’s no need to test all students, just a nationally representative sample. And because a test is only taken by a sample, the results are meaningless at an individual level. Students are unlikely to be anywhere near as bothered by them as they would be when taking a test with high individual stakes. Of course, for teachers and schools the stakes might be high, but as this can’t be passed on to students it would be less likely to warp the curriculum in the way the current system does.
Richard Selfridge explains how this might be done here. He says,
Whilst accountability, like taxes, are now simply part of life, we can choose how we hold schools to account, and it would be a straightforward task to separate school accountability from the current system of non-qualification pupil testing. … The arguments for high-stakes tests at aged 7 and 11 are entirely outweighed by the arguments against them. There are better ways to hold schools to account.
What is this better way? Selfridge argues that “We need a sensible, low stakes method of tracking pupil performance over time”
A number of countries – notably the USA with its NAEP program, and New Zealand with its Council for Education Research – have well-established systems for tracking regional and nationwide pupil performance. PISA, TIMSS and PIRLS, much admired by educational policy makers in the UK and elsewhere, all undertake survey-based assessments of educational systems around the world. In fact, the UK used to have a system of its own (the Assessment of Performance Unit) before the decision to test every 7 and 11 year old was taken in the late 1980s. The 2008 House of Commons Children, Schools and Families Committee recommended just such a system in its Testing and Assessment report, and there is a considerable body of knowledge about using a sample-based survey of pupil performance.
So, that’s my advice: by all means scrap SATs, but please don’t further disadvantage the most disadvantaged children by replacing them with teacher assessment. Instead, use intelligent sampling to hold schools to account in a way that doesn’t adversely affect children’s education.
I’m fully in agreement about teacher assessment–not only is it unreliable and time-consuming, but we can argue that it’s highly immoral to force teachers to come up with information which effectively measures their own performance.
I’m also strongly in favour of routine low-stakes tests created by teachers. There was a time when this was standard practice in the US and UK, but you need to be pretty long in the tooth to recall those days.
However, Tim Oates is giving an incomplete picture of how Finland’s system is working out. A Brookings Institution report found that
“In 2011, Finland participated in TIMSS for the first time since 1999. Finland’s 2011 math scores are statistically indistinguishable from the U.S. at both 4th and 8th grades. While U.S. scores have improved since 1999, Finland’s have declined (both changes are statistically significant)…The U.S. press has not told the entire story about Finland. Not everyone in Finland has applauded the emphasis on real world math. In 2005, a petition signed by more than 200 university mathematicians complained that, despite the country’s high PISA scores, students were increasingly showing up for college unprepared in mathematics. An analysis of items on a Finnish matriculation exam revealed a sharp fall off in computation skills, particularly with problems involving fractions and exponents.”
Like you, I think the Arithmetic Paper is one of the few parts of our existing SATs which is fit for purpose. But I think there are two powerful arguments against sampling: first, kids need a bit of stress now and again. I’m fed up with educators using ‘mental health’ like a beggar’s sore. Secondly, the bias against any kind of testing–even low-stakes tests–remains incredibly strong. This crucifies low-ability pupils, who can recall little if anything each time a subject re-appears in our spiral curriculum.
I’m not sure your arguments against sampling stack up. I’m not using a ‘mental health card’ – I too think students need a bit of positive stress – eustress – but the current high stakes of KS2 tests for schools, along with the lack of purpose or certification for children, make them ripe for gaming and malpractice. What would be better, especially for building students’ storage strength, is regular in-class retrieval practice alongside a system which is harder to game and has less of a negative effect on the Yr 6 curriculum.
Also, to be fair to Tim Oates, he really isn’t an uncritical fan of Finland, as is clear from this: https://www.cambridgeassessment.org.uk/insights/finish-fairy-stories-tim-oates/