At first glance, it may not seem that the umpires of Major League Baseball and the regulators at the US Food and Drug Administration have much in common. But look again, says Professor Jerry Kim. “Umpires are baseball’s regulators. It’s their job to judge quality to make sure the integrity of the game is sound in an objective, unbiased way. That’s parallel to what the FDA or, for that matter, any gatekeeper or critic does.”
Kim’s research into the pharmaceutical industry showed evidence that decisions of presumably objective regulators at the FDA were influenced by the status and reputation of the pharmaceutical firms: better-known, higher-status firms got faster approval for drugs similar to those that lower-status firms were bringing to market at the same time. But there’s no definitive way to tell whether the higher-status firms really were better or were just coasting on reputation, because drug quality has many facets, from how effective a drug is, to whether it interacts poorly with other commonly prescribed drugs, and so on.
That led Kim, along with co-researcher Brayden King of Northwestern University, to look for a way to confirm whether perceptions of quality really are influenced by perceptions of reputation and status — including social factors such as the relationship between the evaluator and evaluatee, and what they already know about each other.
Major League Baseball (MLB) offered the perfect opportunity to see if and how reputational bias works. MLB has four high-speed cameras installed in each League stadium. The cameras take 25 snapshots of each pitch, capturing the speed and spin rate of each pitch from different angles and recording where in the strike zone each pitch lands. This data — for almost 800,000 pitches from almost 5,000 games in 2008 and 2009 — gave the researchers objective measures of quality that they could compare to umpires’ actual calls, which they then compared with player stats and All-Star appearances.
They found that status, as measured by the average number of All-Star appearances (per year) the pitcher had made, clearly influenced umpires’ calls. About 15 percent of all calls were mistakes, but the rate at which the umpires made bad calls varied depending on the status of the pitcher on the mound, even after controlling for other factors thought to influence calls, including whether the pitcher is on the home team, game attendance, inning, and ball count. For a five-time All-Star, umpires were about 16 percent more likely to expand the zone and mistakenly call a ball a strike than they would for a typical player who had no All-Star appearances. Similarly, a five-time All-Star pitcher was about 9 percent less likely to have the umpire mistakenly miss a pitch that was in the strike zone.
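For readers who want to see the core comparison in concrete terms, here is a minimal Python sketch of how miscall rates might be tallied by pitcher status. It is not the researchers' actual analysis: the records are invented toy data, the function name `miscall_rates` is hypothetical, and the sketch only counts raw rates, whereas the study also controlled for factors such as home team, attendance, inning, and ball count.

```python
from collections import defaultdict

# Hypothetical toy records, invented for illustration (NOT the study's data).
# Each pitch: (pitcher's All-Star appearances,
#              camera says the pitch was in the strike zone,
#              umpire called it a strike)
pitches = [
    (5, False, True),
    (5, False, False),
    (5, True, True),
    (5, True, True),
    (0, False, False),
    (0, False, False),
    (0, True, False),
    (0, True, True),
]


def miscall_rates(records):
    """Return {all_star_count: (false_strike_rate, missed_strike_rate)}.

    false_strike_rate: share of true balls that were called strikes.
    missed_strike_rate: share of true strikes that were called balls.
    """
    counts = defaultdict(
        lambda: {"balls": 0, "false_strikes": 0, "strikes": 0, "missed_strikes": 0}
    )
    for all_star, in_zone, called_strike in records:
        c = counts[all_star]
        if in_zone:
            c["strikes"] += 1
            if not called_strike:
                c["missed_strikes"] += 1
        else:
            c["balls"] += 1
            if called_strike:
                c["false_strikes"] += 1
    return {
        status: (
            c["false_strikes"] / c["balls"] if c["balls"] else 0.0,
            c["missed_strikes"] / c["strikes"] if c["strikes"] else 0.0,
        )
        for status, c in counts.items()
    }


rates = miscall_rates(pitches)
for status, (false_strike, missed_strike) in sorted(rates.items()):
    print(status, false_strike, missed_strike)
```

In this toy data the five-time All-Star gets more balls called as strikes and fewer strikes missed, which is the direction of the pattern the study reports; the magnitudes here mean nothing.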
Overall, both reputation and status have comparable effects, and players with great reputations and high status enjoy an interaction effect. “All-Star status alone isn't always great, but pitchers whose past performance aligns with their reputations enjoy a boost,” Kim says. “Pitchers like Greg Maddux who have a good reputation for throwing strikes and having precise control who were also high status benefitted the most from umpires’ biases.” In contrast, a player with high status who also had a reputation for throwing a lot of balls and overall less control saw the status effect diminished — these pitchers weren’t as likely to benefit from an umpire’s mistake at the same rate.
Kim explains these results as a function of expectations on the part of the umpires. “When an umpire sees a high-status pitcher, he assumes that player is high quality. He tends to expect a strike, and so he’s more likely to see a strike.” A pitcher with a reputation for consistently throwing strikes triggers the same expectation.
Other researchers looking at similar questions have focused on why prominent figures (be they big banks, pharmaceutical firms, sports figures, or movie stars) get favorable treatment by asking if it’s because they are better quality, or if others defer to them out of self-interest. But Kim says that the umpires’ judgments are too instantaneous to allow them to make any self-interested calculations. “If it’s conscious, you would expect that in important situations where they might expect more scrutiny, umpires would make far fewer of these biased calls,” Kim says. “We find the exact opposite: the more critical the situation (think bottom of the ninth, tied game, runner on second), the more umpires made biased calls during these critical times. That suggests they’re relying on unconscious biases and expectations more, not less.”
The implications extend beyond baseball. “This mechanism is a metaphor for what we do in everyday life, with regulators and different kinds of gatekeepers and experts. Consider the long history of women scientists being ignored because they’re women. Even if they do great research, the established experts automatically discount them because they don’t expect great research from women,” Kim says. “If this is in fact a cognitive phenomenon, driven by unconscious expectations, and people expect you to perform poorly, then there’s the somewhat unfair and banal advice to just work harder.” Ultimately, low-status players in any field who perform well find themselves handicapped by comparison, while high-status players may find themselves rewarded even when they are undeserving, Kim says. “In either direction, those advantages and disadvantages add up.”
Jerry Kim is assistant professor of management at Columbia Business School.
Read the Research
Kim, Jerry, and Brayden King. “Seeing Stars: Matthew Effects and Status Bias in Major League Baseball Umpiring.” Management Science (forthcoming).