…it is the peculiar and perpetual error of the human understanding to be more moved and excited by affirmatives than by negatives… – Francis Bacon
Today’s post has been contributed by a reader who has asked to remain anonymous, but got in touch after reading my blog explaining why I’d abandoned the SOLO taxonomy. Whilst this post isn’t directly related to SOLO, it does address the need to provide compelling evidence when we start getting excited about a particular style or approach to teaching. Increasingly I’ve become convinced that one way to increase students’ attainment might be to harness some sort of permanent Hawthorne Effect by telling them that they are the focus of a series of cutting-edge interventions. One problem with this theory was that I couldn’t see how I might come up with endless new strategies to perpetrate. Surely, I reasoned, a condition for producing a Hawthorne Effect must be that the teacher would have to be convinced of the efficacy of the intervention? Well, maybe not. Read on…
My study
My study was fuelled by my personal lethargy towards yet another initiative imposed without any convincing evidence that we actually needed to do anything.
Background
At the end of 2004 the latest Inset trend concerned addressing the gender gap, specifically boys’ underperformance at KS3 and KS4. Staff were presented with data showing a difference in mean KS4 points scores – with girls achieving a higher mean than boys. The conclusion presented to us was that this difference mattered and we needed to do something about it.
No ‘proper’ stats were used to quantify the significance of this difference. So, I went back to the raw data and analysed the impact of the following factors:
- Gender
- FSM
- Originating primary school
- KS2 English / maths / science results
- KS3 English / maths / science results
- Reading age
- Learner attendance
- Teacher attendance
- SEN / MAT / EAL
All had an impact ‘on average’, but the most significant factors were:
- Teacher attendance
- Pupils’ attendance
- KS2 English results
All of which were significant at p=0.001 or less.
Gender and FSM were the least significant factors of those measured.
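For readers wondering what the ‘proper’ stats might look like, here is a minimal sketch of one way factors like these could be tested against KS4 outcomes – a two-sample t-test for binary factors and a simple correlation for continuous ones. This is not the author’s original analysis; the file and column names are invented purely for illustration.

```python
import pandas as pd
from scipy import stats

# Hypothetical pupil-level data: one row per pupil, 'ks4_points' as the outcome.
df = pd.read_csv("ks4_cohort.csv")

# Binary factors (e.g. gender, FSM): compare the two group means with Welch's t-test.
for factor in ["gender", "fsm"]:
    group_scores = [g["ks4_points"].to_numpy() for _, g in df.groupby(factor)]
    t, p = stats.ttest_ind(group_scores[0], group_scores[1], equal_var=False)
    print(f"{factor}: difference in group means, p = {p:.4f}")

# Continuous factors (e.g. attendance, KS2 English): test the correlation with outcomes.
for factor in ["pupil_attendance", "teacher_attendance", "ks2_english"]:
    r, p = stats.pearsonr(df[factor], df["ks4_points"])
    print(f"{factor}: r = {r:.2f}, p = {p:.4f}")
```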
So armed, I went back to the Headteacher and said (words to that effect): “I’m not putting in place a scheme to address gender differentials in science, as other factors are more important.” The Head’s response was to explain that this was not a request and that the instruction came from the Director of Education in the LEA.
I returned to my desk, wrote up my findings and sent them to the Director of Education, who declined to respond.
Time moved on, and the whole school gender ‘issue’ continued apace, this time supported by ‘evidence’ from Professor Dave Egan (Cardiff University and adviser to the Welsh Government). Perturbed, I contacted Dave Egan to express my worries over all this – only to get his agreement that his data had been taken out of context and that he had never prescribed anything so draconian. Sensibly, he suggested that teachers should act only if the evidence in their own school showed an intervention was necessary. Sadly this had been lost in translation, and schools were mandated to “have a gender differential policy”.
Move forward to the start of the next academic year.
The science department had conclusive evidence that gender difference was amongst the least important factors affecting our pupils’ performance. Nevertheless we were compelled to discuss how we would fix this non-existent problem and implement a solution. I continued my data exploration and surveyed all the KS4 pupils for:
- Odd / Even house number
- Games console ownership
- Left or right-handed
- Gender
As anticipated, all four factors showed an impact on KS4 outcomes when considered as averages. In order of significance they were:
- Odd / Even house number
- Games console ownership
- Left or right-handed
- Gender
So living in an odd-numbered house had a greater impact on your GCSE results than your gender did.
I wrote up my findings and presented them to SLT. It was treated as a ‘bit of fun’. One SLT member looked a bit worried and asked, “You’re not seriously expecting us to buy all our students a PS3 are you?”
Inspection was looming and the school needed to demonstrate effective monitoring of data. I was basically told to “wind my neck in” and “play ball”. In order to show that the school was “research based” and was “putting in place appropriate interventions”, I conceived the following experiment:
Year 7 cohort, six-form intake, mixed-ability form-group classes.
Classes 1 & 2 taught by teacher A
Classes 3 & 4 taught by teacher B
Class 5 taught by teacher C
Class 6 taught by teacher D
All classes were taught the same curriculum topics at the same time, following the same scheme of work. Classes 5 & 6 did not take part in the experiment.
Classes 1 and 3 were selected for the ‘intervention’
Classes 2 and 4 were selected to be “control classes” with no interventions
These classes were chosen so that we could keep teaching as consistent as possible.
The intervention consisted of informing the classes that they were “part of an experiment to try out new teaching ideas” and that they would be “monitored closely”. A letter was sent to parents informing them that their child’s class “had been selected to trial a new science scheme of work” and that we would be “updating parents at the end of the study.”
That was it. Nothing else changed between the classes. The only intervention was telling the learners that there was an intervention.
All the Year 7 pupils were assessed before and after the intervention. Pupils in the classes that received the pseudo-intervention achieved on average two sub-levels of progress, whereas the control classes achieved only 1.5 sub-levels. This was significant at p=0.005.
Importantly, this intervention was more significant than the gender split that I was expected to ‘do something about’.
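To make the comparison concrete, here is a minimal sketch of the kind of test that could produce such a result – a two-sample t-test on sub-levels of progress between the ‘told it was an intervention’ classes and the controls. The numbers are simulated and chosen only to mirror the figures quoted above; this is not the school’s actual data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated sub-levels of progress for two groups of ~55 pupils each;
# the means (2.0 and 1.5) are invented to echo the figures in the post.
intervention = rng.normal(loc=2.0, scale=0.6, size=55)
control = rng.normal(loc=1.5, scale=0.6, size=55)

t, p = stats.ttest_ind(intervention, control, equal_var=False)
print(f"difference in mean progress: {intervention.mean() - control.mean():.2f} sub-levels")
print(f"p-value: {p:.4f}")
```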
Our conclusion was that telling pupils that they were part of an experiment, that they were special and that they were receiving some extra attention produces an impact. So, armed with this wealth of data and interesting evidence, we started the next school year. But did it make a difference to school priorities?
No. The school continued to mount expensive, time-consuming interventions that focussed on gender, FSM and pupils’ levels of literacy and numeracy.
My point (which never seemed to get any traction) concerned proper statistics – especially analysis of variance and significance. Any measure where you split learners into two groups will always produce a difference between the two groups when you look at the average of the data set. Only by analysing variance between groups and significance is it possible to determine if this difference is worthy of acting on. Even obviously meaningless splits such as left/right-handed, odd/even houses or fake strategies will show a difference on average.
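A quick simulation makes the point vividly: split a cohort in half by any arbitrary criterion (the equivalent of odd/even house numbers) and the two group means will essentially never coincide, yet the gap only rarely reaches significance. The figures below are invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
scores = rng.normal(loc=40, scale=8, size=180)  # a hypothetical cohort's point scores

gaps, significant = [], 0
for _ in range(1000):
    mask = rng.permutation(len(scores)) % 2 == 0  # an arbitrary two-way split
    a, b = scores[mask], scores[~mask]
    gaps.append(abs(a.mean() - b.mean()))           # the groups always differ on average...
    if stats.ttest_ind(a, b).pvalue < 0.05:         # ...but the difference is rarely 'significant'
        significant += 1

print(f"average gap between group means: {np.mean(gaps):.2f} points")
print(f"splits where the gap reached p < 0.05: {significant} / 1000")
```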
It’s a sad reflection on schools and their relationship with data that this exercise could probably be repeated pretty much anywhere and would likely get similar results.
Reminds me of a BECTA claim – used to justify the national laptop programme – that children with a computer at home did better at GCSE than those without. Probably those with an SUV on the drive would correlate similarly, so should we buy all families SUVs? 🙂 Strange how we wring our hands about the state of education, then those in charge of it continue to show they are mathematically and scientifically illiterate without a trace of embarrassment.
Brilliant and tragic …
Seems to relate back to expectancy and the impact of aspiration. In this case it was the students’ – and in other cases the teachers’ – expectation of greater success that led/leads to just that. If I were to draw a conclusion from this or other evidence-based studies, treating the idea expressed here as a placebo, it is that we’re actually discussing a mindset in which we feel we can achieve more. So maybe we should cut through the statistics and research about the effect sizes of everything from phonics to the use of TAs, and discuss the impact of empowerment in the classroom as a key factor in enabling all to achieve more.
The results from this kind of study should be shouted from the rooftops, in my humble opinion. It’s a perfect demonstration of the problems in analysing ‘data’ for something as complicated as institutional education, and that’s before you even get to concerns which could be raised such as the use of small data sets, non-random variables which aren’t independently and identically distributed and the validity of levels as a measure of knowledge and understanding.
Mind you, the fact that the writer can use p values suggests that s/he understands data at a level beyond most in education. It would be useful to know the size of the samples/cohorts (I’m guessing 30 children a time – which has scope for huge variation in results anyway) and the confidence intervals for the p values.
The last two paragraphs in particular are worth stenciling to the wall.
“My point (which never seemed to get traction) concerned proper statistics – especially analysis of variance and significance. Any measure where you split learners into two groups will always produce a difference between the two groups when you look at the average of the data set. Only by analysing variance between groups and significance is it possible to determine if this difference is worthy of acting on. Even obviously meaningless splits such as left/right-handed, odd/even houses or fake strategies will show a difference on average.”
There will *always* be differences between samples, and if the ‘data’ doesn’t warrant the use of statistical analysis – as is the case in just about all education ‘data’ – the results of any ‘data analysis’ will tell you next to nothing.
A companion study might be this one: https://www.sree.org/conferences/2014s/program/downloads/abstracts/1137.pdf in which the authors ‘find that – simply due to chance – teacher effects can appear large, even on outcomes they cannot plausibly affect’ (Teacher Effects on Student Achievement and Height: A Cautionary Tale; Marianne P. Bitler, University of California – Irvine, Sean Corcoran, New York University, Thurston Domina & Emily Penner, University of California – Irvine).
I’d love to see any other example of this kind of research.
“confidence intervals for the p values”?
Good point, Luke: p-values don’t have ‘confidence intervals’ – it’s a long time since I wrote this, and I’m not sure what I was getting at here, I’m afraid…
But how do you quantify classroom empowerment such that you can analyse it statistically? The same problem occurred to me earlier in the post about teacher and pupil attendance: is there a sliding scale of effect which is statistically demonstrable, or some arbitrary cut-off point?
One of the issues with initiatives is that they are often born when new appointments are made in schools. When you are promoted, there is an expectation that you will introduce SOMETHING. I remember a seasoned Head of Subject telling an enthusiastic young Assistant Head – ‘I know what you’re trying to do son. Don’t worry. I’ll do whatever you like. I’m retiring in a few years. I can tell you that I’ve been at the cutting edge of education more times than you’ve had hot dinners!’ Brilliant. That Assistant Head was me and I’ve never forgotten it.
This reminds me that a specialized education staff plays an important role in the formation of the student.
I was 28 when I was appointed head of science in a large 11-18 comp. It was a good traditional school with the usual dose of cynics. Computers were new then, and I wrote some software to store and analyse all our assessment stuff on a BBC B that I brought in from home. Everyone accepted that initiative because it saved them a lot of tedious paperwork. So I’d say the proof of the pudding is in the eating: if it saves people time and improves their quality of life, they’ll generally do it. What is cutting edge and effective is not always possible to determine until it is tried. As Edison said when his assistant was moaning about lots of failures in testing inventions: we now know 10,000 things that don’t work. The key is letting go of things that don’t work, but not just dismissing anything new with cynicism either.
Thank you for posting this, David, and to your contributor. This is a great reminder that teachers are expected to become more research-informed. Hope you don’t mind, I’ve cross-linked to this post from here: http://sptr.net/blog/impact-of-gender-in-school-science/
I am slightly confused by what you have done here, though the general conclusions seem correct. How exactly did you analyse the competing possible predictors of KS4 scores? A multiple regression or…?
Did the author ever try to get this published in a peer reviewed journal?
Useful guide to evaluation of school action research projects here: http://educationendowmentfoundation.org.uk/library/diy-evaluation-guide
We’re promised a ‘more user-friendly’ version later in the year, but this might be a good starting point for colleagues wishing to evaluate the impacts of interventions etc.
Yes, the entirety of school accountability is based on ‘the emperor’s new statistics’. I spend days of my summer holidays producing colour-coded charts, working out differences in averages, and even enjoying losing myself in Excel madness. Every now and then I’d remember that it was all complete b*ll***s. But the charts looked pretty, and something that involved so much number crunching felt as though it must have some sort of rigour. I’ll be doing it again this summer no doubt.
So what is a better way to do it? Or do you think accountability is unnecessary?
Having no accountability is not a good thing – but spurious accountability is no better.
So what do you suggest for a better system?
Nothing that I suspect you would find acceptable. But are you really suggesting that wrong answers are better than no answers?
Making people accountable for things they have at best limited and imprecise control over is hardly reasonable. There are plenty of education systems in the world that see no need to reduce teachers’ roles to that of back-watching number-crunchers.
Why make wrong assumptions? My main criticism of OfSTED is that it’s easy to find faults, much more difficult to fix them. I generally don’t believe in complaining unless I think I have a better solution. Then, if I think it’s important, I make an attempt to implement it.
Point taken – and anticipated. But you asked if I have a solution – and I suspect that the only solution to the problem of flawed accountability is either to get rid of it, or dilute it to a point that it matches the level of control that those people actually have. But I think that many who tend to advocate it (as you appear to do) would not find that acceptable.
For what it’s worth, my own experience of ‘accountability’ (and that of many colleagues) is that it is a millstone around conscientious people’s necks, that more often than not impairs their performance (the less conscientious may tend to care less to start with…). If it’s really badly constructed, the anxiety it generates pushes otherwise honest people to game the system just to protect themselves. Is it really worth that?
Controls generally constrain the highest performers while bringing the weakest up. That is a dilemma and one of the problems with any regulation: it tends to compress to the middle. It’s a political decision as to whether or not the benefits outweigh the drawbacks. I was just asking some questions; I wasn’t advocating any particular system. It’s unlikely large-scale public services will ever escape some sort of bureaucratic regulatory system. Even the private sector is not immune – look at what lack of regulation led to with the banks. I’m subject to OfQual regulation, so I’m not immune either.
Hi David. I read this blog a while ago. It makes sense to any teacher who has a modicum of common sense when asked to intervene using spurious data evidence.