Intellectually, philosophically, morally, the argument over whether teachers’ performance should be evaluated by grading their teaching by means of a lesson observation has been won. Ofsted have accepted the crushing weight of evidence that, despite what some people may choose to believe, there is no validity or reliability to such a grade. Unsurprisingly, there are many benighted souls who choose wilful ignorance over enlightenment and insist on continuing a practice which has less accuracy than a coin toss.

Last week the TES published an article from just such an individual arguing that grades were still a good idea in the Further Education sector. There may be ways to evaluate teachers’ performance more reliably by carefully considering research findings, but the best information we have suggests that the only way to get even a vaguely accurate picture of an individual teacher’s quality is to have a minimum of four different observers each observe a teacher on at least four separate occasions.

Variation in the reliability of teacher ratings with the number of lessons observed and number of raters (Hill et al., 2011)

How often does that happen? If you believe you can reliably make accurate judgements on your own in a single observation, you are deluded. You might as well sacrifice a chicken.

To underline how ludicrous the arguments in favour of retaining graded observations are, I’ve reworded the TES article, replacing ‘graded lesson observation’ with ‘chicken sacrifice’:

In May, Ofsted announced that it was going to stop sacrificing chickens. This was a big deal: until then, chicken sacrifice had been a critically important way of measuring teaching staff’s performance. So what now?

Like all colleges, we at Basingstoke have learned to adapt quickly to meet current best practice or Ofsted’s latest update. However, on this occasion, we won’t be in a hurry to ditch chicken sacrifice. This is for a number of reasons.

Our governors pay incredible attention to the sacrifices; they go through the chickens’ entrails in detail on a monthly basis. If the numbers do not meet our targets, we have to explain why. Even if Ofsted is no longer sacrificing chickens, we still need a performance measure to demonstrate what’s going on in the classroom to our governors.

We also feel it’s important for our staff and their managers to understand how they are performing, for their own professional development and to maintain quality. A chicken sacrifice, alongside their developmental observation and action plan, provides that benchmark.

If a chicken sacrifice is scheduled to take place, we notify staff the week before it’s due. We use an external team for the chicken sacrifices and to validate our own internal judgements; these are still really important to us.

To further improve, we believe we need to inspire and support our teaching staff, so we now supplement chicken sacrifices with developmental observations. Team leaders provide supportive feedback and encourage staff to reflect and improve on their practice.

At the same time, we’re trying to create a culture of open access via learning walks. We have an open-door policy: teaching staff are aware that someone could sacrifice a chicken in their classroom at any moment.

We’re also further developing our in-house professional observation team, so we will have quality assurance of our own performance. This should help to make chicken sacrifice an insightful, respectful and engaging process. And we are encouraging our in-house team to take coaching qualifications; this will contribute to their own professional development as well as the development of the college. The next step will be to look at the most effective way of supporting and coaching our staff to reflect on their own practice and share it with colleagues.

So, for now, chicken sacrifice will remain an important part of the college’s quality measures. But who knows what the future will bring?

I’m sure you noticed that at no point was any argument advanced in favour of sacrificing chickens, or indeed graded observations, other than that we like it and find it useful for… something. I’m sure I’m not alone in finding this reasoning insufficiently compelling.

If you’d like to read more on the consequences of grading lessons, these posts might make a useful starting point: