Self- and peer assessments are becoming more popular in classrooms, but there are few data on the reliability and validity of such assessments performed by school children. Because these factors are greatly affected by the number of raters, we conducted two studies to determine the rating behaviours of teenagers in self- and peer assessments, and how the number of raters influences the reliability and validity of self- and peer assessments. The first study involved 116 seventh graders (the first grade of middle school), in which students individually playing musical recorders were subject to self- and peer assessments. The second study involved 110 eighth graders, with Web pages constructed by students being subject to self- and peer assessments. Generalizability theory and criterion-related validity were used to obtain the reliability and validity coefficients of the self- and peer ratings. Analyses of variance were used to compare differences in self- and peer ratings between low- and high-achieving students. The coefficients of reliability and validity increased with the number of raters in both studies, reaching the acceptable levels of 0.80 and 0.70, respectively, with 3 or 4 raters in the first study (involving assessments of individual performance) and with 14-17 raters in the second study (involving assessments of group work). Furthermore, low- and high-achieving students tended to over- and underestimate the quality of their work in self-assessment, respectively. The discrepancy between the ratings of students and experts was higher in group-work assessments than in individual-work assessments. The results have both theoretical and practical implications for researchers and teachers.