Proof of progress – Part 1


Measuring progress is a big deal. I’ve written before about the many and various ways we get assessment wrong but, increasingly, I’m becoming convinced there are some ways we might get it right. As regular readers will know, I’m interested in the potential of comparative judgement (CJ) and have written about it here and here.

Greg Ashman mentions the process obliquely in his new book:

When we measure on an absolute scale using a set of criteria, we introduce the possibility of all students scoring 9 or 10 out of 10, particularly if we have trained them well. However, what is really of instructional value are the differences between essays that score 10. What makes the best essays better than the next best essays? We won’t even know there is a difference if they all score 10.

A way that we can do this is to force a comparison. We can lay the essays out on a large table and start to rank them using our expertise; our concept of quality. Once we have a rough ordering of the essays, we can start to ask: What makes these ones better than those ones?

… a useful check would be to intersperse essays of calibrated quality – perhaps from an external examination or scored by a different school or group of schools – into the ranking. Schools could collaborate to make this work. There are even computer programs now available that help teachers to rank essays by picking the better paper from a pair of papers.

For all our talk about progress in schools, most of our data is garbage. Anything based on individual judgement using rubrics will be unreliable (markers suffer from quite predictable unconscious bias) and invalid (rubrics cannot adequately describe expert performance). If you want meaningful information about how your students are actually doing, then you need to collect data differently.

As with anything else, there are different ways to approach CJ, and I was really interested to read this description of an English department’s attempt to use CJ to forensically investigate students’ progress. But at what cost?

The process of reading twenty pairs of essays took about three hours for each judge, with the median time per judgement lying around the five-minute mark.

This is totally at odds with my experience of CJ. Dr Chris Wheadon suggests that judging should be seen as fundamentally different to marking:

…the transition away from marking takes time. Judging is instinctive, so should be quick and easy. We’ve found that English teachers can make good judgements about GCSE essays in 7 seconds, with a median time of 30 seconds. The process is hugely slowed when you ask judges to take notes – which is extraneous to CJ. The problem with note taking is that teachers tend to slip back into a marking mindset rather than staying in the judging zone.

This is the approach that I’m interested in trialling: can teachers get reliable and valid data about pupils’ progress through a process which takes substantially less time than that required by marking and using rubrics? And this is exactly what we’ve decided to do at Swindon Academy.

Full disclosure: Chris Wheadon of No More Marking has waived a fee to work with us on this trial and in return I have agreed to blog about the experience.

So, here’s the set up. We’re going to start by judging our Year 5 students in English. I met with Year 5 teachers to find out what students had been reading and how assessments were usually structured. This is typical of the sort of thing we found in students’ books:

[Image: Screen Shot 2016-01-30 at 14.16.42]

We agreed that the task on which they would be assessed would require them to read a short extract from Michael Morpurgo’s version of Beowulf and then write a description in response.

Obviously, we want to enable our students to write as well as possible and so decided the assessment task should not be too unfamiliar. Here is the question we agreed on:

[Image: Screen Shot 2016-01-30 at 14.03.31]

Once students have completed the assessment, the next stage is for Chris to give teachers some training on how judgements should be made. Primary teachers and Key Stage 3 English teachers will then go through the process of using CJ to build up a, hopefully, accurate picture of exactly where the students are.
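For the curious, here is a minimal sketch of how a pile of "this one is better than that one" decisions can be turned into a single rank order. It assumes a Bradley-Terry style model, which is a common choice for comparative judgement; I am not claiming it is what No More Marking actually runs. The function name `rank_scripts`, the `prior` smoothing (each script is treated as having drawn against an imaginary average script) and the example scripts are all made up for illustration.

```python
from collections import defaultdict


def rank_scripts(judgements, iterations=100, prior=0.5):
    """Rank scripts from pairwise judgements, best first.

    judgements: list of (winner, loser) pairs, one per judgement.
    prior: hypothetical smoothing; each script gets `prior` wins and
           `prior` losses against an imaginary script of strength 1.0,
           so a script that never wins (or never loses) stays finite.
    """
    wins = defaultdict(float)
    # met[a][b] = number of times a and b were compared
    met = defaultdict(lambda: defaultdict(int))
    for winner, loser in judgements:
        wins[winner] += 1
        met[winner][loser] += 1
        met[loser][winner] += 1

    # Start every script at the same strength, then refine repeatedly.
    strength = {script: 1.0 for script in met}
    for _ in range(iterations):
        updated = {}
        for script in met:
            # Comparisons against the imaginary average script.
            expected = 2 * prior / (strength[script] + 1.0)
            # Comparisons against real scripts.
            expected += sum(
                count / (strength[script] + strength[other])
                for other, count in met[script].items()
            )
            updated[script] = (wins[script] + prior) / expected
        strength = updated

    return sorted(strength, key=strength.get, reverse=True)


# Four made-up judgements across three scripts.
example = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
ranking = rank_scripts(example)
```

The point of the sketch is that no single judgement decides anything: each script's position comes from all the comparisons it took part in, so adding more judges and more judgements is what steadies the ranking.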

One problem we have anticipated is that as teachers we are so used to using rubrics to direct teaching that we may end up looking for those aspects of writing we have taught rather than looking at the writing in front of us. In order to test this out we’re paying the paltry sum of £100 to tap into Chris’s network of expert markers (all of whom in this case will be students completing PhDs in creative writing) to see if their perception of good writing is different from ours.

We’ll have to wait and see…

January 30th, 2016 | assessment


  1. manyanaed January 30, 2016 at 2:34 pm

    That is almost exactly how I used to organise the grading and selection of project work from schools that we used to set the grade boundaries for GCSE ICT. We were working with a couple of hundred projects of many pages each. I was the principal moderator. The ranking process, grades G to A* was quite quick and it was easy to see any projects that did not ‘fit’. What we then spent a great deal of time on was analysing what we had each used to make our decisions about why that project was a tad better than the next one.

    • David Didau January 30, 2016 at 2:36 pm

      Thanks Peter – interesting to know.

      I confidently predict that CJ will replace traditional exam marking for all essay based subjects within the next 5 years.

      • julietgreen January 30, 2016 at 2:54 pm

        I look forward to it too. The myth of ‘reliable’ teacher assessment through such things as criteria and rubric still abounds. It’s everywhere in the literature from the DfE and other sources.

  2. Michael Tidd January 30, 2016 at 2:45 pm

    I think your point towards the end is key: primary teachers are so well-used to the old APP criteria and similar approaches, that many may long have lost a real sense of what makes writing good, and instead could resort to identifying features from such lists.
    I know this, because I’m aware of it in myself!
    It looks like a fascinating project – I’ll look forward to reading Part 2!

    • David Didau January 30, 2016 at 2:54 pm

      Thanks Michael – I’m really looking forward to seeing if my expectations are met or confounded.

  3. thom gething January 30, 2016 at 3:17 pm

    We have been using CJ on a small scale in our English department in Y7 and Y11 this year. It is early days but talking with the team they found it quick to do and the results tended to reinforce the views they already have of the students’ attainment. What I find encouraging is that for their departmental CPD goal for the rest of the year the team have decided to use some of the material from the CJ work to focus on improving writing through modelling.

    What I liked just as much was seeing one of our Geography teachers leaning over to take a look at the results with our English HoD and then discussing similarities and differences in their views on students. It was one of those off-the-cuff staffroom conversations that you always like to see.

    My hope is that by the start of next year we have a better understanding of how CJ might work for us, particularly as we are a bilingual school and we need to do a better job of raising expectations in both English and Spanish.

  4. fish64 January 30, 2016 at 3:18 pm

    Anything that gets pupils away from a dry diet of examining mark schemes and criteria and on to focussing on the content of what is being taught is a good thing. But, as you say, a whole generation of teachers and school leaders cannot conceive of anything different……

    • David Didau January 30, 2016 at 5:06 pm

      This is the future!

      • Greg Wright January 30, 2016 at 5:21 pm

        It’s also the past. This is how we marked GCSE Technology projects c.1990. As always in education, if you wait long enough it comes around again.

        • David Didau January 30, 2016 at 5:48 pm

          Yeah. The bit you were missing in the 90s – and the bit that will see Ofqual push this method forward – is the ability to aggregate judgements to massively increase reliability.

      • Scott Williams January 30, 2016 at 5:36 pm

        A very interesting proposition. I agree that traditional mark schemes can be damaging. It’s terrible when students produce excellent answers showing a thorough understanding but get no marks because ‘it’s not in the mark scheme’. It would also remove the time spent by schools teaching students what examiners are looking for. Teaching to the test!

        Could there be a move towards CJ for the majority of subjects by getting students to show understanding through extended writing?

        How do you report back to students? Is it a grade or a rank?

        • warrenvalentine January 30, 2016 at 7:07 pm

          This is the core question for me, how do you report back to students and how far would you hive this off from the feedback and marking process?

          David, how would the judgements be aggregated? I must confess that I’m not entirely sure how Comparative Judgement would be applied to national external exams.

        • David Didau January 30, 2016 at 7:49 pm

          We can choose to report back as either a grade, a rank or both. I know Chris is looking at ways to assess maths through essay responses but this is quite controversial.

          Warren: rather than me explain the process again here, it might be easier if you read some of my previous blogs I linked to in this post.

  5. heatherfblog January 30, 2016 at 7:55 pm

    CJ sounds very reliable. There is still a fundamental problem with presuming it will show generic ‘progress’ when performance is so context specific, tied to grasp of that particular piece of literature.

  6. Jane Considine January 30, 2016 at 8:52 pm

    I would be really interested in getting involved in comparing pupils’ independent written work and have access to enormous amounts of work. Is there an option to stick a pennyworth in?

    • David Didau January 30, 2016 at 9:18 pm

      Yes, of course. Get in touch with Chris at

  7. […] Measuring progress is a big deal. I’ve written before about the many and various ways we get assessment wrong but, increasingly, I’m becoming convinced there are some ways we might get it right. As regular readers will know, I’m interested in the potential of comparative judgement (CJ) and have written about it here and here.  […]

  8. Kelly February 1, 2016 at 8:18 pm

    Interesting stuff, will share with my team. I’m finding it harder to think about how to do this with assessing reading though, as it wouldn’t be a piece of written work as such. Would it be talking about a child and how we see them as a reader and then making comparisons with other children in the class or across the year group?

  9. […] 30th January – Proof of progress – Part 1 […]

  10. […] exemplars which make success more tangible to students and against which, ultimately, we can make comparative judgements of both the quality of their reading and writing […]

  11. […] Back in January I described the comparative judgement trial that we were undertaking at Swindon Academy in collaboration with Chris Wheadon and his shiny, new Proof of Progress system. […]

  12. […] Part 1 of this series I described how Comparative Judgement works and the process of designing an […]

  13. @blueprintteach November 26, 2016 at 11:06 am

    Brilliant, thanks for this David. I am interested in this from a “measuring progress” angle. Trying to get my head around an Ofsted situation. I presume it would make for reliable and valid school data that can show how various groups have made progress. It would also free up teacher time to invest in meta-cognitive activities with pupils instead of marking and for teachers to develop their pedagogy etc. Comparing with other local schools and nationally could also help teachers to see what their own pupils need to be doing to get a better result. To be honest I can only really see positives for CJ use. However, it probably would be best used with iPad technology. Would you believe some schools still do not have many knocking around!

  14. […] by Daisy Christodoulou about comparative judgement. Then another excellent article from David Didau here (there are two further parts that are a must-read). I began blabbering uncontrollably about CJ to […]
