It has been said that value-added is the worst form of teacher evaluation, except for all the others that have been tried.
In a new paper just out from Brookings this month, the authors take exception to reports that routinely decry the misidentification of average teachers as ineffective by value-added measures. All evaluation systems, they observe, make classification errors, including the systems in place circa 2010.
More attention should be paid to the large numbers of ineffective teachers incorrectly identified as effective. The picture below, modified from one in the paper and reason enough to read it, depicts a hypothetical teacher evaluation system with just two categories: "effective" and "ineffective" (binary systems like this still dominate teacher evaluation across the country). A hypothetical manager sets a cut-score based on value-added data: teachers with value-added scores above the cut-score are considered effective, while those below it are considered ineffective.
Discussions about value-added too often focus on Region A, the group of "truly effective" teachers incorrectly identified as ineffective by the value-added cut-score: the false negatives. The authors, however, point to Region B, the group of "truly ineffective" teachers incorrectly identified as effective: the false positives. Decreasing the cut-score reduces the number of false negatives but increases the number of false positives. Most teacher evaluation systems now have cut-points effectively set all the way over to the left, producing far too many false positives.
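To make the tradeoff concrete, here is a minimal simulation sketch of the cut-score mechanism described above. The population size, noise level, "truly ineffective" threshold, and candidate cut-scores are all invented for illustration; none come from the Brookings paper.

```python
# A toy model of the cut-score tradeoff: noisy value-added scores are
# compared against a cut-score, and we count each kind of error.
# All numeric parameters here are hypothetical.
import random

random.seed(0)

N = 10_000  # hypothetical population of teachers
teachers = []
for _ in range(N):
    true_effect = random.gauss(0.0, 1.0)             # "true" effectiveness
    measured = true_effect + random.gauss(0.0, 0.7)  # noisy value-added score
    teachers.append((true_effect, measured))

TRULY_INEFFECTIVE = -1.0  # assumed threshold defining a "truly ineffective" teacher

for cut_score in (-2.0, -1.0, 0.0):
    # A teacher is labeled effective when the measured score is at or above the cut-score.
    false_pos = sum(1 for t, m in teachers
                    if m >= cut_score and t < TRULY_INEFFECTIVE)   # Region B
    false_neg = sum(1 for t, m in teachers
                    if m < cut_score and t >= TRULY_INEFFECTIVE)   # Region A
    print(f"cut-score {cut_score:+.1f}: "
          f"{false_pos:4d} false positives (Region B), "
          f"{false_neg:4d} false negatives (Region A)")
```

Running the sketch shows the pattern the authors describe: sliding the cut-score to the left nearly empties Region A while swelling Region B, and sliding it right does the reverse. No placement of the cut-score eliminates both kinds of error; it only trades one for the other.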
Mislabeling effective teachers primarily impacts individual teachers, which is indeed unfortunate. However, mislabeling an ineffective teacher as effective multiplies the unfairness by a factor of 20 to 25, the number of students in a typical classroom. These errors impact whole classrooms of children.