Increasingly, studies have focused on gender and racial biases in the course evaluations students routinely complete at the end of a course. For example, one large-scale study of thousands of course evaluations from across a university over a seven-year period found evidence of both gender and cultural bias (defined as bias against faculty from a non-English-speaking background, as determined by country of birth and/or language spoken at home). Overall, female faculty and faculty from non-English-speaking backgrounds received lower ratings than their male or English-speaking counterparts. The effect was particularly pronounced in the intersectional analysis (female faculty from non-English-speaking backgrounds compared to male faculty from English-speaking backgrounds) and within the Science faculty. Interestingly, these effects vanished when the researchers looked at the questions related to the course itself, as opposed to the instructor, suggesting that “biases creep in when students evaluate the person, not the course” (p. 10).
Various solutions have been proposed for mitigating bias in course evaluations, including making students more aware of the potential for bias in their responses; increasing the proportion of female and underrepresented faculty; focusing questions on student learning and specific instructional practices, rather than more broadly on the instructors themselves; and reducing or even eliminating the emphasis on course evaluations in tenure and promotion decisions.
A more recent study, reported on by Inside Higher Ed on February 17, 2021 (see “The Skinny on Teaching Evals and Bias” in the article list below), compiled findings from more than 100 articles on bias in Student Evaluations of Teaching (SETs). Key findings include:
- SETs show evidence of measurement bias: courses with lighter workloads, courses with more favorable grade distributions, non-elective courses, and upper-level discussion-based courses receive better scores from students
- Students tend to rate courses in the natural sciences lowest and courses in the humanities highest
- An instructor’s gender, race, ethnicity, accent, sexual orientation, or disability status all impact student ratings
- Male instructors are perceived as more accurate in their teaching, more educated, less sexist, more enthusiastic, more competent, better organized, easier to understand, and more prompt in providing feedback, and they are penalized less for being tough graders
- Both male and female students expect women and men to conform to prescribed gender roles; students seem to prefer professors with masculine traits and penalize women who don’t conform to feminine stereotypes – a sort of double jeopardy for female faculty
- Students show a “gender affinity” in which they prefer professors of the same gender as themselves
- Though there is evidence of bias within other categories of identity (particularly against Black, Asian, and Latinx faculty), the authors criticize the lack of research in this area, as well as the lack of intersectional research looking at the impact of multiple converging identities on SET bias
- The authors attribute this lack of research in “no small part [to] the underrepresentation of people of color among faculty [. . .] there are often too few people of color to make reasonable inferences from the data.”
- The authors offer recommendations for reforming the SET process, arguing that SETs should be used to contextualize students’ experiences in the classroom, not to evaluate teaching
- They recommend using SETs not as a comparative metric across faculty, but to compare a faculty member’s own teaching “trajectory” over time
- Because the distribution of SET scores tends to skew negative, the median or modal response should be reported instead of the mean, since a few very low ratings can pull the mean well below the typical student’s response (see the brief illustration after this list)
- Invitations for open-ended comments should be avoided, as these tend to produce the strongest evidence of bias; instead, students should respond to specific prompts
- Alternative and complementary measures of teaching effectiveness (e.g., peer evaluations, teaching portfolios, reviews of course materials) should be used in addition to SETs
- SETs should be used with the utmost caution in hiring, tenure, and promotion decisions; even greater care is warranted for SETs collected during the COVID-19 era
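To make the point about skewed score distributions concrete, here is a minimal Python sketch using hypothetical ratings (not data from any of the studies above): a handful of very low ratings drags the mean well below the median and the mode on a 1-5 scale.

```python
# A minimal sketch with hypothetical ratings (not data from the study):
# in a negatively skewed distribution, most students rate the course
# highly and a few rate it very low, which drags the mean downward.
from statistics import mean, median, mode

ratings = [5, 5, 5, 5, 5, 5, 4, 4, 4, 2, 1, 1]  # 1-5 scale with a low tail

print(f"mean:   {mean(ratings):.2f}")  # 3.83 -- pulled down by the low outliers
print(f"median: {median(ratings)}")    # 4.5  -- closer to the typical response
print(f"mode:   {mode(ratings)}")      # 5    -- the most common response
```

In a case like this, the median or mode comes closer to how the typical student actually rated the course, which is why the authors recommend reporting those summaries rather than the mean.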
Other articles (including some recent popular press articles from Inside Higher Ed and The Chronicle) related to this topic are included below:
- What if you can’t remove the bias from course evaluations? (Teaching Newsletter, The Chronicle of Higher Education, October 2024)
- Teaching Evaluations Are Broken. Can They Be Fixed? (The Chronicle of Higher Education, February 2024)
- Teaching Evaluations are Racist, Sexist, and Often Useless: It’s time to put these flawed measures in their place (The Chronicle of Higher Education, July 2023) – includes link to an extensive list of prior studies on bias in course evaluations
- Faculty Gender Imbalances Yield Biased Student Ratings (Inside Higher Ed, January 2023)
- Empowering Students Through Instructor Evaluations (Inside Higher Ed, April 2022)
- The Skinny on Teaching Evals and Bias (Inside Higher Ed, February 2021) – article summarized above
- Stroebe, W. (2020). Student evaluations of teaching encourages poor teaching and contributes to grade inflation: A theoretical and empirical analysis. Basic and Applied Social Psychology, 42(4), 276-294.
- Medina, M. S., Smith, W. T., Kolluru, S., Sheaffer, E. A., & DiVall, M. (2019). A review of strategies for designing, administering, and using student ratings of instruction. American Journal of Pharmaceutical Education, 83(5). doi: 10.5688/ajpe7177
- We Don’t Trust Course Evaluations, but Are Peer Observations of Teaching Much Better? (The Chronicle of Higher Education, June 2019)
- Peterson, D. A. M., Biederman, L. A., Andersen, D., Ditonto, T. M., & Roe, K. (2019). Mitigating gender bias in student evaluations of teaching. PLOS ONE, 14(5), e0216241.
- Boring, A., Ottoboni, K., & Stark, P. B. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research, 0(0), 1-11. doi: 10.14293/S2199-1006.1.SOR-EDU.AETBZC.v1
- MacNell, L., Driscoll, A., & Hunt, A. N. (2015). What’s in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40, 291-303.
- Smith, B. P. & Hawkins, B. (2011). Examining student evaluations of Black college faculty: Does race matter? The Journal of Negro Education, 80(2), 149-162.
- Reid, L. D. (2010). The role of perceived race and gender in the evaluation of college teaching on RateMyProfessors.com. Journal of Diversity in Higher Education, 3(3), 137-152.