Monday, June 23, 2014

Increasing diversity by ditching the GRE


Every fall, college seniors in the US take the Graduate Record Examination (GRE), and their scores are submitted along with their applications to grad school. Many professors, particularly those in physics departments, believe that the GRE is an important predictor of future success in grad school, and as a result many admissions committees employ score cutoffs in the early stages of their selection process. However, both older and recent studies have shown that there is little correlation between GRE scores and future graduate school success.

The most recent study of this type was published in Nature Jobs. The authors, Casey Miller and Keivan Stassun, show that there are strong correlations between GRE scores and race/gender, with minorities and (US) white women scoring lower than their (US) white male counterparts. They conclude, "In simple terms, the GRE is a better indicator of sex and skin colour than of ability and ultimate success."

Here's the key figure from their article:




As the chair of the admissions committee for two years while I was at Caltech, and having served on admissions committees for the past five years, I can attest that these offsets and correlations persist for the Physics GRE as well. Indeed, over the past 20 years, applications to the Caltech Astronomy graduate program show a persistent 80-point offset between male and female applicants from the US.

GRE scores of male (black) and female (red) applicants to the Caltech Astronomy graduate program. The histograms have been normalized such that the peak bin is unity, for clarity. The dashed lines indicate the median score for male and female applicants (740 and 660, respectively).

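For readers who want to make the same kind of plot with their own data, here is a minimal sketch of the two operations the caption describes: scaling a histogram so its peak bin is unity, and taking the difference of the group medians. The function names, bin edges, and scores are all hypothetical. The scores are made up for illustration, chosen only so that the medians come out to the 740 and 660 quoted in the caption; they are NOT the Caltech applicant data.

```python
from statistics import median


def peak_normalized_histogram(scores, bin_edges):
    """Count scores per bin, then divide by the largest count so the peak bin is 1.0."""
    counts = [0] * (len(bin_edges) - 1)
    last = len(counts) - 1
    for s in scores:
        for i in range(len(counts)):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            # Bins are half-open [lo, hi); the last bin also includes its upper edge.
            if lo <= s < hi or (i == last and s == hi):
                counts[i] += 1
                break
    peak = max(counts)
    if peak == 0:
        # No scores fell inside the bins; avoid dividing by zero.
        return [0.0] * len(counts)
    return [c / peak for c in counts]


def median_offset(group_a, group_b):
    """Difference between the median scores of two groups."""
    return median(group_a) - median(group_b)


# Made-up scores for illustration only (hypothetical, not real applicant data).
group_a = [600, 700, 720, 740, 740, 760, 800]
group_b = [580, 640, 660, 660, 700, 760]
edges = [500, 600, 700, 800, 900]  # hypothetical bin edges

hist_a = peak_normalized_histogram(group_a, edges)
hist_b = peak_normalized_histogram(group_b, edges)
offset = median_offset(group_a, group_b)
```

Normalizing to the peak bin, rather than to the total count, lets two groups of very different sizes be overlaid on the same axes, which is presumably why the figure was drawn that way.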
There are two possible reasons for this offset: either men and women are fundamentally different in their physics abilities, or GRE scores are not testing what we think they're testing. There are many reasons to favor the latter over the former. First, the male/female divide does not persist across applicants from all countries. For example, the gap does not appear for Chinese and Indian applicants, nor does it show up in applications from most Eastern European countries. Indeed, women from China and India score significantly and consistently higher than US women on the Physics GRE. (While I have hard data for the male/female offset in US grad applicants, I don't have access to my original dataset. Thus, this last point is based on my recollection of top applicants/scores.)

Another reason is that there are no scientific studies to back the assertion of male/female cognitive differences, and certainly none to explain the observed differences in GRE scores. This is assuming that GRE scores are indicative of the cognitive abilities that matter for success in graduate school, which is highly debatable, as explained in the Nature article cited above, and in other studies. Here's a figure showing the relationship (or lack thereof) between GRE Physics scores and performance in graduate course work in the Harvard Physics program, based on a study from 1996:


Then why are GRE scores such strong predictors of gender (and race/ethnicity)? The reason has been known to psychology researchers for decades: it is the phenomenon of stereotype threat (or identity threat). The basic idea is that if the identity of a test-taker corresponds to a group of people who are stereotypically "bad" at the skills being tested, they will subconsciously experience additional stress to not conform to that stereotype. If the student taking the Physics GRE is a US woman, she is from a society that has explicitly or implicitly taught her that women are poor at math and physics. As a result, when she sits down to take the exam, she is not only aiming for a good score for her own grad school admission prospects, but she is also under pressure not to perform as poorly as society expects her to. This additional stress has been demonstrated to cause deleterious physiological and cognitive effects in test takers.

But before you are tempted to attribute stereotype threat to a weakness inherent to racial and gender minorities, note that the phenomenon can be triggered in white men. From the abstract of Aronson et al. (1999):
The two experiments reported in this paper demonstrate that stereotype threat is a general phenomenon that can be experienced by members of any group depending on context. In Experiment 1, White males with high math SAT scores took a difficult math test. In one condition, students were given information suggesting that Asians typically outperform other students in math. Moreover, the students in this condition were told that the study was designed to identify the nature and scope of differences in performance between Asians and other groups in mathematics. In a second control condition there was no mention of Asians, only information suggesting that the task was designed to assess mathematical ability. Participants in the first condition performed significantly worse than students in the control condition. Experiment 2 replicated this finding but also showed moderation by identification with mathematics; only those students who were highly identified with mathematics performed more poorly under stereotype threat. These studies show that stereotype threat can undermine performance of any individual who has a strong identity in a domain when context highlights stereotypes suggestive of relatively poor performance in that domain.
If you are interested in learning more about identity threat, I encourage you to read Claude Steele's clear and entertaining book on the subject, Whistling Vivaldi, and/or check out this compendium of over 300 peer-reviewed journal articles.

There's not much more to say about the GRE. It is a deeply flawed metric for assessing future success in graduate school, and in my opinion as an astronomy professor, it should be dropped from the admissions process entirely. We at Harvard have downgraded the importance of GRE test scores in our admissions process, and the quality and diversity of our admitted students have increased as a result. Other schools around the country are doing or considering the same. I'll conclude with the conclusion of Miller & Stassun:
Let us be frank: we believe that many STEM faculty members on admissions committees and upper-level administrators hold a deep-seated and unfounded belief that these test scores are good measures of ability, of potential for doing well in graduate school and of long-term potential as a scientist, and that students who score poorly on standardized exams are not likely to become PhD-level scientists. These assumptions are false. 
This is not a call to admit unqualified students in the name of social good. This is a call to acknowledge that the typical weight given to GRE scores in admissions is disproportionate. If we diminish reliance on GRE and instead augment current admissions practices with proven markers of achievement, such as grit and diligence [link added by blogger], we will make our PhD programmes more inclusive and will more efficiently identify applicants with potential for long-term success as researchers. Isn't that what graduate school is about?
