Pathways towards a properly functioning market in qualifications

A recent collection of essays published by the Centre set out, among other aims, to draw salient factors from the mix of those which may relate to England and Wales’ long-term upward trend in 14-19 exam grades. This phenomenon has been taken by the present government to offer both convincing evidence of the failure of the qualifications market, and a pretext for moves to rationalise provision and raise the demand level in an effort to improve the utility and currency of qualifications. In that elements of its proposals remain ‘unfinished business’ for those recently appointed to the ministerial brief at the DfE, and their successors, a lot rests on how we interpret the trend. Contributors to the volume explored the deteriorative effects of the overuse of assessment and qualifications as levers of education reform; the pressures of the accountability framework; and, underpinning it, the effects of comparability on standards, largely agreeing that these have been the most important factors influencing what Tim Oates of Cambridge Assessment refers to as ‘grade drift’ in a criterion-referenced grading system, which many in assessment consider to be inherently biased towards gains. In the interests of comparability, as Dale Bassett of AQA cast it in his essay, governments over several decades have sought to standardise provision in every possible aspect, with a special focus, increasingly, on the form and content of national assessment. Their efforts, contributors concurred, have been in many ways counter-productive. The pressures of working with this political consensus, the trials and the errors, and of compliance with the increasingly micro-level regulation that is the logical outcome of the standardisation agenda, have compromised quality in important ways.
It may indeed be the case that fewer young people today fail outright in academic terms as a result of this standardisation, but the burden of regulatory compliance has also had a narrowing and homogenising effect, leaving little room for variance between board offerings in content and none, relatively (at least officially), in terms of level of difficulty. This has reduced the scope for students to explore and specialise, discover and demonstrate their particular strengths and aptitudes. So, in reference to the popular Govian view, it is not that competition depresses standards; but rather that competition under these conditions undermines efforts to improve quality. The significance of this collective contribution should not be underestimated, offering as it does the prospect of an identifiable consensus around the nature of the problem and the way ahead. But there was another component of the (it must be said rather ambitious) brief we gave to contributors – namely, what can and should be done to address this – and here they were, understandably, rather more cautious. Broadly, while some contributors (Len Shackleton of the University of Buckingham, Geoff Holden of City & Guilds, and Tim Oates) felt that system stability – essentially involving government restraint – should be the immediate priority, all argued (or implied) that reform of the accountability framework was ultimately necessary with a view to better alignment of incentives with intended outcomes. Stressing how little we know about the importance of different features of the accountability system, how these features interact, and how they shape incentives and impact on outcomes, Tim Oates, and Robert Coe (with my colleague Gabriel Heller Sahlgren in a co-authored piece), advised that we trial different approaches to accountability first to find out which works best. 
Nevertheless, contributors broadly anticipated that taking the focus of accountability off qualifications (or in Coe and Sahlgren’s case, designing assessments specifically for accountability purposes, on the basis of the evidence so gathered) would have the effect of relaxing comparability requirements and allowing for reduced prescription in respect of the structure and content of qualifications and the way in which they are assessed and graded. Divested of the strictures of accountability, and the attendant demands of comparability, boards, and indeed the regulator, would be able to give themselves more fully to the task of ensuring the utility of their qualifications with respect to learners’ needs and end-users’ requirements. Two approaches to reform of the accountability framework in particular emerged, however, which are, I think, worth further consideration in terms of their implications for the way we do assessment. The first, Coe and Sahlgren’s, essentially involves (for the time being) accepting the validity of the range of purposes that national assessments are expected to fulfil, whilst recognising the need for them to be sorted out and defined, and requiring that boards be up-front about these in an effort to improve design and develop criteria for judging their effectiveness. The second, Bassett’s, advocates re-modelling accountability to take account of other (longer-term) outcomes, in addition to assessment results – as they relate to pupils’ long-term success beyond school – with a view to alleviating the direct pressure that comparability exerts on educationally desirable practices. The government’s Open Data drive is currently working towards the routine publication, via the National Careers Service, of data showing learner progression, the success of past students in securing employment, and earnings, to give learners the information they need to make informed choices about their future.
Bassett believes that in the future this could open up opportunities for the development of school-level performance measures, which take account of background data, and which would offer potential for assessing value-added. Others, including Heller Sahlgren in his book, Incentivising Excellence, have argued that even the addition of reputational indicators, such as applications data from school admissions and the results of parent satisfaction surveys, would help mitigate this pressure on national assessments. Coe, however, makes the point that we cannot anticipate their impacts on the overall incentive structure. There have been many confident efforts to improve performance indicators that have introduced new and unanticipated problems. Coe tabulates features of hard and soft accountability frameworks and argues that these could be combined in various ways, each of which should be piloted in competition with the others to help determine the accountability framework most likely to succeed in inducing system-wide improvement. He believes that there are ways of ensuring that the demands of accountability do not undermine what is educationally desirable in assessments – through attending to perverse incentives, policing measures designed to identify and deter abuses, and the like. Even so, it is hard to envisage any system that might emerge at the end of this piloting process escaping the burden of comparability that accountability places on national qualifications intended to give all learners equitable access to educational capital. Even if the process should succeed in reconciling such political considerations with the interests of learners, or at least in producing a situation in which we learn to live with trade-offs between the different purposes assessments serve, it will be impossible to prevent misuse, as Paul Newton acknowledges in his important paper on the subject. It’s more likely that Coe’s piloting process would result in a multiplication of assessments.
(Coe’s starting place is that we must accept the validity of the range of purposes to which national assessments are put. He further acknowledges concerns that loading up assessment with too many different purposes might inevitably lead to compromises of fitness.) With the focus of government attention ever on qualifications – the assessments that matter most for students and for which they (and their teachers) are (and would continue to be) maximally motivated – it seems to me that the best we can hope for if we follow this track is a fudge where these dual high-weight purposes, of assessing the contribution of schools for accountability purposes, and student qualification, co-exist in tension. This is essentially what we have now. An alternative, advocated by Heller Sahlgren in his book, and which I think remains worthy of further consideration, is to recognise the essential insight here – that unacknowledged multiplicity of purpose probably inevitably compromises fitness – simplify, and adopt the American way (broadly following the recommendations of the Sykes Review), with SATs serving, albeit inadequately, university (and to a certain extent FE) selection purposes, and with stronger reliance on the judgements of institutions. In this way qualifications might be freed from the ‘national’ framework and refocused on learner and end-user needs. This would of course open the way for further diversification in qualifications too, so that schools could adopt EU curriculums and qualifications already recognised by British universities. Such an approach to assessment policy would be highly compatible with the overall trajectory of devolutionary reforms in education intended to foster school autonomy and increased responsibility for self-improvement. It would also be compatible with the kind of wider remodelling of accountability that Bassett, and Heller Sahlgren, advocate.
I want to conclude with some comment on why I think Bassett’s approach is most promising and then to offer a few brief remarks on the future of qualifications beyond comparability. The chief reason why I think bringing destination and other long-term outcomes data into the accountability equation is a good idea is that it takes the discourse about improvements to educational opportunity beyond the confines of artificial comparability measurement and into the real world. For while comparability measurement in the context of the national curriculum and qualifications project is essentially about managing features of the real world – features such as uniqueness, complexity, value conflict and instability – through the application of statistical controls, you can’t manipulate real-world returns to education. In so far as we may talk about the design of the present system, everything follows from the equality agenda: system rationalisation, regulatory development, reforms to the form and content of national assessments. Because the whole purpose of national qualifications is to guarantee fair access to educational opportunity, an elaborate system of checks and balances has developed to ensure that there is no advantage in candidates being assessed by one board over another. Because we need to be sure that students are not disadvantaged by their choice of subjects, and these are really fundamentally incomparable, much political capital is invested in efforts to define what the constituent elements of a common core curriculum ought to be. In the same way, diversity either in the means of assessment, or among qualifications designed to assess different aptitudes and competencies, cannot ultimately be accommodated in this context. The monitoring of standards over the long term, meanwhile, is essentially about creating a plausibility structure for an acceptable level of variance. Things are arranged this way because politicians are judged on outcomes, not opportunities.
My point is that the political framework creates the circumstances of the conceptual collapse of equality of opportunity into equality of outcome, which in turn is what has, over time, given rise to the comparability framework. But it doesn’t have to be this way. Delivering equal opportunity does not represent the realisation of anything that is inherent to the essential purposes of assessment. Neither does it follow that if boards, and for that matter the regulator, were enabled to give greater heed to those essential purposes they would be less concerned to fulfil a social responsibility to provide assessment for all, or fulfil that role less effectively. In our mature market context, one would expect awarding bodies to strive for maximum accessibility in their general qualifications with a view to maximising market share, but also to differentiate their offerings more, and potentially to develop niche qualifications. With full ownership of their brands restored, under more competitive conditions, one would expect boards to take positive reputational risks on quality improvement – which they are presently disinclined to do. If the essential purposes of 14-19 assessments are to aid the process by which schools help young people explore, identify and develop their abilities, aptitudes, motivation and potential, and to evidence them for end-users, then above all we need to start adopting a more transactional approach to educational opportunity. Providers need to offer diverse qualifications, across a wide range of subjects and vocational pathways, employing different assessment methods, and pupils need to be equipped and given scope to choose which will serve their goals and ambitions, and to take responsibility for the outcomes. The direction of travel of the measures proposed by Bassett and Heller Sahlgren would seem to encourage this change of mind-set, and for this reason, they get my vote.

James Croft

