New Zealand Qualifications Authority

Secondary school statistics

Limitations and Cautions

Qualifications statistics alone seldom answer questions. Careful interpretation is usually needed; otherwise the statistics by themselves may mislead rather than inform. This has always been true: it is not something peculiar to the current system.

Being confident about conclusions

There is always error in measurement, even when large numbers of students are involved. The smaller the number of students, or the shorter the assessment, the lower the confidence in a conclusion. Results for individual classes and departments are likely to have a relatively high margin of error, with limited confidence in any conclusions. The higher the stakes riding on any decision based on data analysis, the higher the confidence you need that the data are based on enough cases to be reliable.
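The link between the number of students and the margin of error can be illustrated with a rough calculation. The sketch below is only an illustration, not a method used in the reports: it assumes the common normal approximation for a proportion at roughly 95% confidence, and the 70% pass rate and the group sizes are invented. It also covers sampling variation only, not other sources of measurement error such as the length of the assessment.

```python
import math


def margin_of_error(pass_rate, n_students, z=1.96):
    """Approximate margin of error for an observed pass rate.

    Uses the normal approximation for a proportion; z=1.96 corresponds
    to roughly 95% confidence. This is a simplification that becomes
    unreliable for very small groups or rates close to 0% or 100%.
    """
    return z * math.sqrt(pass_rate * (1 - pass_rate) / n_students)


# Hypothetical figures: the same 70% pass rate observed in a single class,
# a larger school cohort, and a very large cohort.
for n in (25, 250, 25000):
    moe = margin_of_error(0.70, n)
    print(f"{n:>6} students: 70% plus or minus {moe * 100:.1f} percentage points")
```

Under these assumptions, the single class carries a margin of error of roughly 18 percentage points, while the largest group is within about 1 point, which is why conclusions about individual classes deserve much less confidence.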

Measuring over time

Confidence in results that have limited reliability can be improved by using several sources of data: for instance, data patterns repeated in successive years. Data about the same issue drawn from several unrelated sources are also more powerful: consistent results increase confidence; inconsistent results signal that no reliable conclusions can be drawn.

Compare like with like

It is essential to compare like with like. For instance, external exam results on a standard taught in term 1 in one school and term 4 in another are likely to vary because of when the material was taught rather than because of any difference between the schools.

Accounting for differences

In the past, differences in results across schools were often attributed to the performance of the schools, departments, and teachers concerned. However, subjects and standards are not equal in difficulty (1), and some schools, standards and subjects attract students of different abilities and motivate students differently. Judgments in such cases require great care to ensure that like is being compared with like. While this is now better understood, cases persist in which comparisons are invalid or unfair. Judging whether different results for two standards signal any notable issue for a school's teaching programmes requires comparison with the results for those standards across a range of schools with similar demographic characteristics.

Unlike comparisons arise in several ways. Classes of students with differing levels of ability may undertake somewhat different mixes of standards in a given subject area. Some schools may encourage all students to sit external standards; others may guide those of lesser ability to avoid them. Schools do attract students of differing ability, and 'league tables' may mistakenly rank a school lower than another school that gains better results only because it has a more able student body.

There are strategies to reduce these problems, but they do not remove them. For instance, some of the reports in the web statistics give results as percentages of the 1 July school rolls. But calculating percentages on the 1 July roll is a coarse approach: it may take no account of students who would never be candidates, nor of changes to the school population after the roll date.

To help address this problem, in 2009 a new report type is available that displays data on NCEA achievement based on an adjusted denominator, which includes only those students identified as being entered for sufficient credits to achieve a level of NCEA in the current year. We designed this report in response to schools' feedback on the accuracy of qualifications achievement statistics based on the traditional 1 July roll-based report. The effect of this new participation-based denominator is to remove from consideration all students who were not working towards a level of NCEA in that year.
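The difference between the two denominators can be seen in a small worked example. The figures below are invented for a hypothetical school and the function name is illustrative only; the sketch simply shows how the same number of successful students yields two different percentages depending on which denominator is used.

```python
def achievement_rate(achieved, denominator):
    """Return the percentage of the denominator group that achieved the level."""
    return 100.0 * achieved / denominator


# Hypothetical school; all three figures are invented for illustration.
july_roll = 200            # students on the 1 July roll for the year level
entered_sufficient = 160   # students entered for enough credits to achieve the level
achieved = 128             # students who achieved the level

roll_based = achievement_rate(achieved, july_roll)
participation_based = achievement_rate(achieved, entered_sufficient)

print(f"Roll-based rate:          {roll_based:.1f}%")
print(f"Participation-based rate: {participation_based:.1f}%")
```

In this invented case the roll-based rate is 64% and the participation-based rate is 80%. Neither figure is wrong, but they answer different questions, so two schools should only be compared using the same denominator.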

One of the variables that has been shown to have an effect on the performance of students in assessments is their socioeconomic background. A school's decile provides some indication of the socioeconomic circumstances of students at that school, with higher deciles indicating higher socioeconomic levels. Therefore, any comparison of a school with other schools should at least be made between schools with similar decile ratings. Nonetheless, the decile is a coarse indicator, and two schools with the same decile rating might actually have quite different socioeconomic profiles.

A more meaningful approach for a school can be to choose a group of other schools with similar socioeconomic circumstances and demographics with which to compare its results. Often, teachers and principals have a better idea of which other schools have similar circumstances to their own than can be provided through official decile ratings or other administrative data.

1. A difficult standard undertaken by a strong cohort might have similar results to a less difficult standard undertaken by an average cohort.

Page updated: 08 May 2009