    Districts are facing hard choices: How can teacher evaluation help?

    February 13, 2025

    District leaders have to make many decisions about their teacher workforce, from making staffing assignments, to encouraging strong teachers to work in the hardest-to-staff schools, to determining which teachers to lay off when budgets fall short. These decisions should not be made in a vacuum. Understanding which teachers are consistently high performing can help districts make more strategic decisions, and identifying where teachers need to improve can produce better results for everyone.

    However, evaluating and rating teachers has long proved to be complicated. Research suggests that strong evaluation requires more than just putting an evaluation system in place—it requires effective implementation.

    In this District Trendline, we draw from data in NCTQ’s Teacher Contract Database to analyze the teacher evaluation policies of large districts across the country and determine whether they align with the research on what will likely make for effective and impactful evaluation systems.

    As The Widget Effect argued many years ago, teachers are not interchangeable and should not be treated as if they were. Every teacher is important, and districts need to be able to distinguish among teachers, considering their strengths and limitations, for a range of reasons.

    Why might a district need to identify strong teachers?

    • To identify teacher leaders and mentor or cooperating teachers: Teachers who are effective at advancing student learning and display other leadership traits (e.g., the ability to effectively support and mentor their peers) may be the best candidates to take on additional roles and responsibilities. For example, cooperating teachers who are instructionally effective and host student teachers set those student teachers up to be far more effective. A robust evaluation system can offer a standardized and fair way of identifying teachers for these leadership roles.
    • To assign the strongest teachers to the students who need them most: Schools may want to assign the most effective teachers to the students who are struggling the most to promote excellent results for all students. Districts with more sophisticated evaluation systems may even be able to determine which teachers are most effective with different groups of students and make staffing assignments accordingly. Making these staffing determinations requires being able to measure teachers’ effectiveness, which could comprise both teachers’ impact on student academic growth and their impact on factors like attendance and behavior.
    • To grant additional compensation to retain and attract teachers: Once a district knows which teachers are especially strong, it can pay those teachers more as a tool to encourage them to stay in the district or to encourage them to work in schools that are chronically understaffed or underperforming.

    Why might a district need to identify weak teachers?

    • To provide more intensive support for improvement: All teachers should receive guidance about how to become more effective as part of their evaluation, but some teachers need more support than others. Districts should use evaluation systems to identify those most in need of growth and provide them additional resources (rather than uniformly offering the same support to all teachers, especially when resources are limited) such as more intensive mentoring and targeted professional learning.
    • To make tenure decisions: In states that grant permanent status to teachers, it is critical for administrators to collect and use evaluation data to make informed decisions about awarding (or denying) tenure, a weighty decision for teachers’—and their students’—futures.
    • To inform layoffs: Unfortunately, many signs point to impending district layoffs as pandemic recovery funds run out. While many districts still rely on “last in, first out” policies, districts are increasingly recognizing that teachers’ effectiveness should be part of the decision so that schools do not lose strong early-career teachers.

    Five research-backed evaluation methods that help districts identify strong and weak performers

    1. Use multiple and frequent measures of teacher performance.

    Multiple measures

    An evaluation system that incorporates multiple measures of teachers’ effectiveness is associated with more consistent, stable ratings over time.

    Most districts have taken this to heart, with only 12% relying on a single component (most often “professional practice,” which may comprise several submeasures). More than a quarter of large districts (40) incorporate at least three components.

    Figure 1.

    The most common component is “professional practice” (included in 93% of districts’ evaluation systems), which often includes teachers’ observation ratings. Nearly three-quarters of districts (74%) include measures of student achievement or student growth; six of these districts include a school- or team-wide measure of student growth or achievement, which can be a powerful tool to promote team teaching. The next most common component is “professional growth or professional development” (26%), although other districts subsume this element into the “professional practice” component. A handful of districts (8%) include a component focused on stakeholder input (from parents, students, and others), and a few (3%) include a component focused on professional responsibilities.

    In 2023, NCTQ analyzed evidence on ways to measure teacher outcomes that look beyond assessment data. Some “nontest measures” included attendance (specifically, teachers’ effect on student attendance), a factor that is increasingly pressing as student and teacher attendance rates remain low. Notably, NCTQ did not identify any evaluation systems that consider teachers’ effect on student attendance, and only two (District of Columbia Public Schools [D.C.] and San Bernardino City Unified School District [CA]) consider teachers’ attendance rates.

    Multiple observations

    Just as teachers’ evaluation ratings are more reliable and holistic when they encompass multiple components, they should also be based on multiple observations. Basing ratings on multiple observations can ensure that one really bad (or unusually good) day does not overly influence a teacher’s overall rating. Here, districts vary widely, requiring anywhere from 1 observation per evaluation cycle for nontenured teachers to as many as 10. In one district, teachers who have had tenure for at least two years do not need to be observed even once. Not surprisingly, nontenured teachers tend to be observed more often than tenured teachers.
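
The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-to-4 rubric scale, the "typical" score of 3, and the off-day score of 1 are hypothetical values chosen for the example, not figures from NCTQ's data or from any district's rubric. The observation counts (1, 2, 4, and 10) fall within the range of 1 to 10 observations per cycle described above.

```python
from statistics import mean

# Hypothetical rubric scores on an assumed 1-4 scale (not from NCTQ data).
TYPICAL_SCORE = 3.0   # assumed score on an ordinary day
OFF_DAY_SCORE = 1.0   # assumed score on one unusually bad day


def average_with_one_off_day(num_observations):
    """Average rating when exactly one observation lands on an off day."""
    scores = [OFF_DAY_SCORE] + [TYPICAL_SCORE] * (num_observations - 1)
    return mean(scores)


# Compare cycles with different numbers of observations.
for n in (1, 2, 4, 10):
    print(f"{n:>2} observation(s): average = {average_with_one_off_day(n):.2f}")
```

With a single observation, the off day is the entire rating (1.0). With 10 observations, the same off day pulls the average down only to 2.8, much closer to the teacher's typical score of 3.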

    Figure 2.

    Frequency of observations

    The number of observations should be interpreted within the context of how long an evaluation cycle lasts. Most large districts (74%) evaluate nontenured teachers annually and another 18% evaluate new teachers twice a year, but little more than half of districts (55%) require annual evaluations for tenured teachers. Many districts that require annual evaluations for nontenured teachers require evaluations for tenured teachers only every three or even five years. For example, in three districts, a tenured teacher may be observed only once or twice every five years. While research shows that teachers typically improve over time, that improvement cannot be taken as a given for every individual teacher. Regularly observing and evaluating teachers, even those with a long history in the district, can help ensure that students have a high-quality learning experience every year.

    2. Consider evaluators’ time and capacity.

    Observing teachers takes a great deal of time and expertise and may “crowd out” other responsibilities. Building that expertise (and teachers’ confidence in their evaluators) requires training evaluators on the observation rubric, selecting evaluators who have experience in and knowledge of the setting where teachers are being observed, and selecting evaluators who are familiar with the content teachers are teaching.

    Allowing people beyond the school’s administrators (e.g., other teachers or subject-matter experts in the district) to conduct observations brings several benefits, including reducing administrators’ time devoted to observations and ensuring that subject-matter experts can conduct observations.

    Large districts vary widely in whether they allow different observers. For example, 14 districts use peer observers (though only 11 use those peer observers for ratings). Some districts make peer observers optional, while a greater share of districts do not allow peer observers. Allowing teachers to observe and provide feedback to their peers can be a necessary step in establishing leadership roles for teachers as part of strategic staffing.

    Figure 3.

    Similarly, only three districts use a third-party observer from outside the school, typically only in the case of a teacher earning a low rating, and 41 make third-party observers optional, often allowing teachers to request them. Many districts have not established policies addressing these additional observers.

    Other strategies can also save evaluators time. Basing observations on video recordings eases the burden on observers, who can watch the videos based on their availability. This process also facilitates better feedback conversations and engenders greater trust from teachers. While NCTQ has not conducted a full scan of districts’ policies related to video recording lessons, we did identify one district that specifically allows them: Portland Public Schools (ME).

    Another way to ease the burden on observers is by allowing for shorter observations, or a mix of shorter and full-length observations. While not quite as informative as watching a full lesson, even 15-minute observations can tell school leaders a great deal about a teacher’s performance. While brief observations may be common for informal observations (which are often unannounced and may not come with formal feedback), they are not common for formal observations. Only 17 districts explicitly allow formal observations that are 15 to 20 minutes for tenured teachers (all but one also allow for brief observations for nontenured teachers). Many other districts have unclear policies with language such as, “must span enough time to reasonably assess teacher’s performance.”

    Figure 4.

    3. Address bias in the system head-on, iterating to make improvements.

    As the saying goes, to err is human, and teacher observations are no exception. Multiple studies have found evidence of bias in teacher observation ratings, including bias based on the race of the teacher being observed or the race of the students in their classroom.

    As described above, some districts allow for peer or third-party observers; having more observers from varied backgrounds may help to mitigate bias.

    Districts can also allow teachers to request a second evaluation when they have concerns about their initial observation ratings. While most districts do not address the right to request a new summative evaluation in documents NCTQ reviewed, a quarter of districts do allow teachers to request a second evaluation (this would go beyond just requesting an additional observation to allow for an entirely new summative evaluation), often under specific circumstances, such as when they have earned a low evaluation rating. In some cases, they can request a second evaluation but may not be granted one. Only one district does not allow teachers to request a second evaluation.

    Some districts (30%) also offer a process for teachers to appeal low summative ratings. In some cases, the grievance process is limited to tenured teachers or to certain circumstances. Thirteen percent of districts allow teachers to grieve or appeal their summative rating in limited circumstances. For example, one allows teachers to appeal if they earn a low rating on the student growth measure but earn “highly effective” in all other areas; another allows teachers to appeal a rating if they would lose tenure status. The appeals process varies by district; some districts engage a review committee or panel to revisit the evaluation rating. Occasionally, the teacher can even appeal the panel’s decision, for example, by bringing the issue to the superintendent.

    However, more than a third of districts (59 districts) have no clear processes in place to either request a second observation or address concerns with the overall evaluation rating.

    Figure 5.

    4. Tie the results of observations and evaluations directly to each teacher’s own customized professional learning.

    Teacher evaluation systems should be designed to help all teachers improve, regardless of where they fall along the spectrum of teacher effectiveness. One way this can happen is by guiding teachers toward relevant professional learning based on needs identified during their evaluations. A recent qualitative study found that principals who “implemented teacher evaluation robustly” focused on formative feedback and considered helping teachers improve to be the main purpose of evaluation. Further, research suggests that pairing performance pay programs with professional development results in greater gains in student learning compared to programs without a professional development connection.

    While NCTQ does not systematically collect data on how evaluation results inform professional learning opportunities, we do collect information about whether and how teachers receive feedback. Receiving feedback, ideally through both a conversation with the observer and written documentation, is an essential tool to identify specific areas in which teachers need additional support. Most districts, but not all, provide feedback based on teachers’ observations (86%) and final observation ratings (84%).

    5. Make the ratings meaningful by attaching stakes.

    Given limited resources, districts need to make hard decisions, including how to reward high-performing teachers and when to exit chronically low-performing ones.

    Targeting incentives to high-performing teachers can encourage them to stay in the classroom. In fact, robust teacher evaluation systems paired with performance pay can improve teacher retention, including in high-need schools.

    To offer higher pay to highly effective teachers, districts first need a rating system that distinguishes between teachers. In our sample, most districts (81%) employ a system with at least four rating categories, while 3% (five districts) still use a binary rating system.

    Next, districts must set policies that pay highly rated teachers more. This is still relatively uncommon; three in four districts do not address this issue in the documents NCTQ reviewed. However, 16% of districts offer a bonus or stipend, often limited to teachers working in specific schools, which can be an effective tool to attract highly effective teachers to hard-to-staff schools. Also, some districts allow teachers with higher evaluation ratings to apply for additional roles and responsibilities through which they would earn higher pay.

    Ideally, every teacher who is hired is effective year after year, and those who struggle receive the support and guidance to improve. However, there will inevitably be cases where some teachers are consistently ineffective. In those instances, districts may need to dismiss a teacher to prevent further lost learning for students. Studies of evaluation systems in Chicago and Washington, D.C., found that when strong evaluation systems identified (and led to the exit of) low-performing teachers, those teachers were replaced by more effective teachers, on average.

    In the 148 districts that we reviewed, most districts allow (45%) or require (14%) dismissal if the teacher fails to improve on the final step of the district’s remediation process. However, many districts (40%) do not address this question in the documents NCTQ reviewed.
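
Because this Trendline reports some findings as counts and others as percentages, it can help to put them on the same footing. The short Python sketch below converts three district counts quoted in this article into shares of the 148-district sample. The dictionary labels and the helper name `share_of_sample` are illustrative choices, not NCTQ terminology; the counts and the sample size are taken from the text.

```python
SAMPLE_SIZE = 148  # districts in the sample, as stated in this article

# District counts quoted in the article; labels are paraphrased for brevity.
district_counts = {
    "at least three evaluation components": 40,
    "no clear second-observation or rating-concern process": 59,
    "binary rating system": 5,
}


def share_of_sample(count, sample_size=SAMPLE_SIZE):
    """Return a count as a percentage of the sample, rounded to one decimal."""
    return round(100 * count / sample_size, 1)


for label, count in district_counts.items():
    print(f"{label}: {count} districts = {share_of_sample(count)}%")
```

The results (27.0%, 39.9%, and 3.4%) line up with the article's descriptions of "more than a quarter," "more than a third," and "3%," respectively.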

    In some cases, a teacher may need to receive two or more unsatisfactory ratings before being considered for dismissal. Fortunately, those districts that require multiple unsatisfactory ratings before taking action tend to evaluate teachers on an annual basis. However, a handful of districts (seven in our sample) evaluate tenured teachers only every three to five years and do not have a policy NCTQ could identify regarding the dismissal of chronically underperforming teachers.

    Conclusion

    As education leaders face the uncomfortable reality of teacher shortages and teacher layoffs, it’s even more important for districts to have a clear sense of their teaching talent. Education evaluation does not offer a panacea, but when designed and implemented carefully, it can be a valuable tool to inform critical staffing decisions to best support kids to learn and teachers to grow.


    Endnotes
    1. Bleiberg, J., Brunner, E., Harbatkin, E., Kraft, M. A., & Springer, M. (2024). Taking teacher evaluation to scale: The effect of state reforms on achievement and attainment [EdWorkingPaper: 21-496]. Annenberg Institute at Brown University. https://doi.org/10.26300/b1ak-r251
    2. The sample for this analysis, drawn from NCTQ’s Teacher Contract Database, consists of 148 school districts in the United States: the 100 largest districts in the country, the largest district in each state, and the member districts of the Council of Great City Schools.
    3. Goldhaber, D., Krieg, J., & Theobald, R. (2020). Effective like me? Does having a more productive mentor improve the productivity of mentees? Labour Economics, 63, 101792.
    4. Wood, W. J., Lai, I., Filosa, N. R., Imberman, S. A., Jones, N. D., & Strunk, K. O. (2023). Are effective teachers for students with disabilities effective teachers for all? Educational Evaluation and Policy Analysis, 01623737231214555.
    5. Bueno, C., & Sass, T. R. (2019). The effects of differential pay on teacher recruitment and retention [Working paper no. 219-0519]. National Center for Analysis of Longitudinal Data in Education Research (CALDER).
    6. Morgan, A. J., Nguyen, M., Hanushek, E. A., Ost, B., & Rivkin, S. G. (2023). Attracting and retaining highly effective educators in hard-to-staff schools [No. w31051]. National Bureau of Economic Research.
    7. Cantrell, S. & Kane, T. J. (2013). Ensuring fair and reliable measures of effective teaching. The Bill & Melinda Gates Foundation.
    8. Several other districts embed student input within a broader category like “professional values.”
    9. Student attendance rates: Swiderski, T., Crittenden Fuller, S., & Bastian, K. C. (2024). Student-level attendance patterns show depth, breadth, and persistence of post-pandemic absenteeism. Brookings. https://www.brookings.edu/articles/student-level-attendance-patterns-show-depth-breadth-and-persistence-of-post-pandemic-absenteeism/; National Assessment Governing Board. (n.d.). A primer on attendance and chronic absenteeism on the nation’s report card and beyond. https://www.nagb.gov/naep/chronic-absenteeism.html
      Teacher attendance rates: Bloomberg & Querolo, N. (2024, June 6). Teacher absences are worse now than during the pandemic. It’s costing schools $4 billion a year and some students ‘will never get back on track.’ Fortune. https://fortune.com/2024/06/06/teacher-absences-are-worse-now-than-during-the-pandemic-its-costing-schools-4-billion-a-year-and-some-students-will-never-get-back-on-track/; Daly, T. (2024, March 15). Why are teachers missing so much school? Flypaper. Thomas B. Fordham Institute. https://fordhaminstitute.org/national/commentary/why-are-teachers-missing-so-much-school
    10. These teachers have to have “satisfactory performance,” but in the absence of an evaluation rating, it’s unclear what constitutes satisfactory performance.
    11. Donaldson, M. L., Mavrogordato, M., Youngs, P., & Dougherty, S. M. (2024). Principals’ priorities, teacher evaluation, and instructional leadership. Educational Researcher, 0013189X241273903.
    12. Steinberg, M. P., & Sartain, L. (2015). Does better observation make better teachers? Education Next, 15(1). For examples of how training is considered standard in strong evaluation systems, see Garet, M. S., Wayne, A. J., Brown, S., Rickles, J., Song, M., & Manzeske, D. (2017). The impact of providing performance feedback to teachers and principals. NCEE 2018-4001. National Center for Education Evaluation and Regional Assistance; Kane, T. (2012). Capturing the dimensions of effective teaching: Student achievement gains, student surveys, and classroom observations. Education Next, 12(4).
    13. Kraft, M., & Christian, A. (2021). Can teacher evaluation systems produce high-quality feedback? An administrator training field experiment. American Educational Research Journal. https://www.edworkingpapers.com/sites/default/files/ai19-62_2.pdf
    14. Firestone, W., & Donaldson, L. (2019). Teacher evaluation as data use: What recent research suggests. Educational Assessment, Evaluation and Accountability, 31(3), 289–314; Kraft, M., & Gilmour, A. (2016). Can principals promote teacher development as evaluators? A case study of principals’ views and experiences. Educational Administration Quarterly, 52(5), 711–753. https://journals.sagepub.com/doi/abs/10.1177/0013161X16653445
    15. Kane, T. J., Blazar, D., Gehlbach, H., Greenberg, M., Quinn, D. M., & Thal, D. (2020). Can video technology improve teacher evaluations? An experimental study. Education Finance and Policy, 15(3), 397–427.
    16. Ho, A. D., & Kane, T. J. (2013). The reliability of classroom observations by school personnel [Research paper]. MET Project. The Bill & Melinda Gates Foundation.
    17. Steinberg, M. P., & Sartain, L. (2021). What explains the race gap in teacher performance ratings? Evidence from Chicago Public Schools. Educational Evaluation and Policy Analysis, 43(1), 60–82; Campbell, S. L., & Ronfeldt, M. (2018). Observational evaluation of teachers: Measuring more than we bargained for? American Educational Research Journal, 55(6), 1233–1267.
    18. Donaldson, M. L., Mavrogordato, M., Youngs, P., & Dougherty, S. M. (2024). Principals’ priorities, teacher evaluation, and instructional leadership. Educational Researcher, 0013189X241273903.
    19. Pham, L. D., Nguyen, T. D., & Springer, M. G. (2021). Teacher merit pay: A meta-analysis. American Educational Research Journal, 58(3), 527–566. https://doi.org/10.3102/0002831220905580
    20. Some additional districts may provide feedback depending on a teacher’s tenure status or the evaluation structure.
    21. Nguyen, T., Pham, L., Springer, M., & Crouch, M. (2019). The factors of teacher attrition and retention: An updated and expanded meta-analysis of the literature [EdWorkingPaper: 19-149]. Annenberg Institute at Brown University. https://edworkingpapers.com/ai19-149
    22. The remainder are either unclear, do not address this issue in the documents NCTQ reviewed, or do not rate teachers.
    23. Sartain, L., & Steinberg, M. (2021). Can personnel policy improve teacher quality? The role of evaluation and the impact of exiting low-performing teachers [EdWorkingPaper: 21-486]. https://doi.org/10.26300/d201-7y89
    24. Adnot, M., Dee, T., Katz, V., & Wyckoff, J. (2017). Teacher turnover, teacher quality, and student achievement in DCPS. Educational Evaluation and Policy Analysis, 39(1), 54–76; Walsh, E., & Dotter, D. (2014). Longitudinal analysis of the effectiveness of DCPS teachers [No. 40185.533]. Mathematica Policy Research.