However, evaluating and rating teachers has long proved to be complicated. Research suggests that strong evaluation requires more than just putting an evaluation system in place—it requires effective implementation.1150
In this District Trendline, we draw from data in NCTQ's Teacher Contract Database to analyze the teacher evaluation policies of large districts across the country1151 and determine whether they align with the research on what will likely make for effective and impactful evaluation systems.
As The Widget Effect argued many years ago, teachers are not interchangeable and should not be treated as if they were. Every teacher is important, and districts need to be able to distinguish among teachers, considering their strengths and limitations, for a range of reasons.
Why might a district need to identify strong teachers?
- To identify teacher leaders and mentor or cooperating teachers: Teachers who are effective at advancing student learning and display other leadership traits (e.g., the ability to effectively support and mentor their peers) may be the best candidates to take on additional roles and responsibilities. For example, cooperating teachers who are instructionally effective and host student teachers set those student teachers up to be far more effective.1152 A robust evaluation system can offer a standardized and fair way of identifying teachers for these leadership roles.
- To assign the strongest teachers to the students who need them most: Schools may want to assign the most effective teachers to the students who are struggling the most to promote excellent results for all students. Districts with more sophisticated evaluation systems may even be able to determine which teachers are most effective with different groups of students and make staffing assignments accordingly.1153 Making these staffing determinations requires being able to measure teachers' effectiveness, which could comprise teachers' impact on student academic growth as well as on factors like attendance and behavior.
- To grant additional compensation to retain and attract teachers: Once a district knows which teachers are especially strong, it can pay those teachers more as a tool to encourage them to stay in the district1154 or to encourage them to work in schools that are chronically understaffed or underperforming.1155
Why might a district need to identify weak teachers?
- To provide more intensive support for improvement: All teachers should receive guidance about how to become more effective as part of their evaluation, but some teachers need more support than others. Districts should use evaluation systems to identify those most in need of growth and provide them additional resources (rather than uniformly offering the same support to all teachers, especially when resources are limited) such as more intensive mentoring and targeted professional learning.
- To make tenure decisions: In states that grant permanent status to teachers, it is critical for administrators to collect and use evaluation data to make informed decisions about awarding (or denying) tenure, a weighty decision for teachers'—and their students'—futures.
- To inform layoffs: Unfortunately, many signs point to impending district layoffs as pandemic recovery funds run out. While many districts still rely on "last in, first out" policies, districts are increasingly recognizing that teachers' effectiveness should be part of the decision so that schools do not lose strong early-career teachers.
Five research-backed evaluation methods that help districts identify strong and weak performers
1. Use multiple and frequent measures of teacher performance.
Multiple measures
An evaluation system that incorporates multiple measures of teachers' effectiveness is associated with more consistent, stable ratings over time.1156 Most districts have taken this to heart, with only 12% relying on a single component (most often "professional practice," which may comprise several submeasures). More than a quarter of large districts (40) incorporate at least three components.
The most common component is "professional practice" (included in 93% of districts' evaluation systems), which often includes teachers' observation ratings. Nearly three-quarters of districts (74%) include measures of student achievement or student growth; six of these districts include a school- or team-wide measure of student growth or achievement, which can be a powerful tool to promote team teaching. The next most common component is "professional growth or professional development" (26%), although other districts subsume this element into the "professional practice" component. A handful of districts (8%) include a component focused on stakeholder input (from parents, students, and others),1157 and a few (3%) include a component focused on professional responsibilities.
In 2023, NCTQ analyzed evidence on ways to measure teacher outcomes that look beyond assessment data. These "nontest measures" included attendance (specifically, teachers' effect on student attendance), a factor that is increasingly pressing as student and teacher attendance rates remain low.1158 Notably, NCTQ did not identify any evaluation systems that consider teachers' effect on student attendance, and only two (District of Columbia Public Schools [D.C.] and San Bernardino City Unified School District [CA]) consider teachers' attendance rates.
Multiple observations
Just as teachers' evaluation ratings are more reliable and holistic when they encompass multiple components, they should also be based on multiple observations. Basing ratings on multiple observations can ensure that one really bad (or unusually good) day does not overly influence a teacher's overall rating. Here, districts vary widely, requiring anywhere from 1 observation per evaluation cycle for nontenured teachers to as many as 10. In one district, teachers who have had tenure for at least two years do not need to be observed even once.1159 Not surprisingly, nontenured teachers tend to be observed more often than tenured teachers.
Frequency of observations
The number of observations should be interpreted within the context of how long an evaluation cycle lasts. Most large districts (74%) evaluate nontenured teachers annually and another 18% evaluate new teachers twice a year, but little more than half of districts (55%) require annual evaluations for tenured teachers. Many districts that require annual evaluations for nontenured teachers only require evaluations for tenured teachers every three or even five years. For example, in three districts, a tenured teacher may only be observed once or twice every five years. While research shows that teachers typically improve over time, that improvement cannot be taken as a given for every individual teacher. Regularly observing and evaluating teachers, even those with a long history in the district, can help ensure that students have a high-quality learning experience every year.
2. Consider evaluators' time and capacity.
Observing teachers takes a great deal of time and expertise and may "crowd out" other responsibilities.1160 Building that expertise (and teachers' confidence in their evaluators) requires training evaluators on the observation rubric,1161 selecting evaluators who have experience in and knowledge of the setting where teachers are being observed,1162 and selecting evaluators who are familiar with the content teachers are teaching.1163
Allowing people beyond the school's administrators (e.g., other teachers or subject-matter experts in the district) to conduct observations brings several benefits, including reducing administrators' time devoted to observations and ensuring that subject-matter experts can conduct observations.
Large districts vary widely in whether they allow different observers. For example, 14 districts use peer observers (though only 11 use those peer observers for ratings). Some districts make peer observers optional, while a greater share of districts do not allow peer observers. Allowing teachers to observe and provide feedback to their peers can be a necessary step in establishing leadership roles for teachers as part of strategic staffing.
Similarly, only three districts use a third-party observer from outside the school, typically only in the case of a teacher earning a low rating, and 41 make third-party observers optional, often allowing teachers to request them. Many districts have not established policies addressing these additional observers.
Other strategies can also save evaluators time. Basing observations on video recordings eases the burden on observers, who can watch the videos based on their availability. This process also facilitates better feedback conversations and engenders greater trust from teachers.1164 While NCTQ has not conducted a full scan of districts' policies related to video recording lessons, we did identify one district that specifically allows them: Portland Public Schools (ME).
Another way to ease the burden on observers is by allowing for shorter observations, or a mix of shorter and full-length observations. While not quite as informative as watching a full lesson, even 15-minute observations can tell school leaders a great deal about a teacher's performance.1165 While brief observations may be common for informal observations (which are often unannounced and may not come with formal feedback), they are not common for formal observations. Only 17 districts explicitly allow formal observations that are 15 to 20 minutes for tenured teachers (all but one also allow for brief observations for nontenured teachers). Many other districts have unclear policies with language such as, "must span enough time to reasonably assess teacher's performance."
3. Address bias in the system head-on, iterating to make improvements.
As the saying goes, to err is human, and teacher observations are no exception. Multiple studies have found evidence of bias in teacher observation ratings, including based on the race of the teacher being observed or the race of the students in their classroom.1166 As described above, some districts allow for peer or third-party observers; having more observers from varied backgrounds may help to mitigate bias.
Districts can also allow teachers to request a second evaluation when they have concerns about their initial observation ratings. While most districts do not address the right to request a new summative evaluation in documents NCTQ reviewed, a quarter of districts do allow teachers to request a second evaluation (this would go beyond just requesting an additional observation to allow for an entirely new summative evaluation), often under specific circumstances, such as when they have earned a low evaluation rating. In some cases, they can request a second evaluation but may not be granted one. Only one district does not allow teachers to request a second evaluation.
Some districts (30%) also offer a process for teachers to appeal low summative ratings. In some cases, the grievance process is limited to tenured teachers or to certain circumstances. Thirteen percent of districts allow teachers to grieve or appeal their summative rating in limited circumstances. For example, one allows teachers to appeal if they earn a low rating on the student growth measure but earn "highly effective" on all other areas; another allows teachers to appeal a rating if they would lose tenure status. The appeals process varies by district; some districts engage a review committee or panel to revisit the evaluation rating. Occasionally, the teacher can even appeal the panel's decision, for example, by bringing the issue to the superintendent.
However, more than a third of districts (59 districts) have no clear processes in place to either request a second observation or address concerns with the overall evaluation rating.
4. Tie the results of observations and evaluations directly to each teacher's own customized professional learning.
Teacher evaluation systems should be designed to help all teachers improve, regardless of where they fall along the spectrum of teacher effectiveness. One way this can happen is by guiding teachers toward relevant professional learning based on needs identified during their evaluations. A recent qualitative study found that principals who "implemented teacher evaluation robustly" focused on formative feedback and considered helping teachers improve to be the main purpose of evaluation.1167 Further, research suggests that pairing performance pay programs with professional development results in greater gains in student learning compared to programs without a professional development connection.1168
While NCTQ does not systematically collect data on how evaluation results inform professional learning opportunities, we do collect information about whether and how teachers receive feedback. Receiving feedback, ideally through both a conversation with the observer and written documentation, is an essential tool to identify specific areas in which teachers need additional support. Most districts, but not all, provide feedback based on teachers' observations (86%) and final observation ratings (84%).1169
5. Make the ratings meaningful by attaching stakes.
Given limited resources, districts need to make hard decisions, including how to reward high-performing teachers and when to exit chronically low-performing ones.
Targeting incentives to high-performing teachers can encourage them to stay in the classroom. In fact, robust teacher evaluation systems paired with performance pay can improve teacher retention, including in high-need schools.1170
To offer higher pay to highly effective teachers, districts first need a rating system that distinguishes between teachers. In our sample, most districts (81%) employ a system with at least four rating categories, while 3% (five districts) still use a binary rating system.1171
Next, districts must set policies that pay highly rated teachers more. This is still relatively uncommon; three in four districts do not address this issue in the documents NCTQ reviewed. However, 16% of districts offer a bonus or stipend, often limited to teachers working in specific schools, which can be an effective tool to attract highly effective teachers to hard-to-staff schools.1172 Also, some districts allow teachers with higher evaluation ratings to apply for additional roles and responsibilities through which they would earn higher pay.
Ideally, every teacher who is hired is effective year after year, and those who struggle receive the support and guidance to improve. However, there will inevitably be cases where some teachers are consistently ineffective. In those instances, districts may need to dismiss a teacher to prevent further lost learning for students. Studies of evaluation systems in Chicago1173 and Washington, D.C.,1174 found that when strong evaluation systems identified (and led to the exit of) low-performing teachers, those teachers were replaced by more effective teachers, on average.
In the 148 districts that we reviewed, most districts allow (45%) or require (14%) dismissal if the teacher fails to improve on the final step of the district's remediation process. However, many districts (40%) do not address this question in the documents NCTQ reviewed.
In some cases, a teacher may need to receive two or more unsatisfactory ratings before being considered for dismissal. Fortunately, those districts that require multiple unsatisfactory ratings before taking action tend to evaluate teachers on an annual basis. However, a handful of districts (seven in our sample) evaluate tenured teachers only every three to five years and do not have a policy NCTQ could identify regarding the dismissal of chronically underperforming teachers.