How Much Should We Trust Online Ratings?

How the sequential nature of online reviews affects product ratings.
Based on research by Omar Besbes and Marco Scarsini

How much should we trust online ratings? Ubiquitous across the internet on websites from Amazon to Rate My Professors, online ratings exert substantial influence over consumer decisions. A single-star increase on Yelp has been found to lead to a 5 percent to 9 percent increase in product revenue.

But in a recent working paper, Professor Omar Besbes, working with Marco Scarsini of the Singapore University of Technology and Design, examined how the sequential nature of reviews distorts the statistics of ratings (such as the average or their distribution) from what might be called the “true” statistics: those one would observe if consumers submitted their reviews all at once, without seeing previously submitted reviews.

Online reviews for products and services are typically reported sequentially rather than in parallel: new reviews arrive over time, not all at once. Most consumers are therefore exposed to reviews that have already been posted before submitting their own, and they tend to look at this history, particularly at the average review (often synthesized in a star or number rating), before writing. Previous research has shown that a review tends to reflect not just a consumer’s personal opinion, but also the influence of reviews that preceded it.

The researchers analyze a broad class of consumer reporting mechanisms that account for past reviews, dividing them into two behavioral categories: compensating behavior and herding behavior. Compensating consumers award an inflated high score or an exaggeratedly low score in an attempt to shift the average of ratings toward the rating they believe the product deserves. Herding consumers, on the other hand, follow the prevailing opinion, shifting their own personal rating toward the average in the belief that the crowd must be right.

Whether consumers are herding or compensating, the average of reported ratings for a product will stabilize over time; however, the researchers found, this average rating might be above or below the “true” average, a difference they define as the bias gap. Compensating and herding behavior have very different effects on this gap, the researchers’ model showed.

Compensating tends to have limited influence on where a score eventually stabilizes, whereas the influence of herding can be very significant. And while the average rating is highly influenced by the sequential nature of reviews, the position of a product or a service relative to competitors is usually preserved.

The researchers further analyze the potential for manipulation of reviews and show that compensating behavior limits the impact of review manipulation. However, when herding behavior occurs, a few very early reviews can dramatically shift the trajectory of a product’s ratings toward the high or low extremes.
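For readers who want to experiment with the intuition, the following is a minimal simulation sketch. It is not the researchers' model: the linear update rules, the `strength` parameter, the uniformly distributed 1-to-5 opinions, and the five planted 5-star reviews are all illustrative assumptions. It measures a bias gap as the final displayed average minus the average of consumers' private opinions.

```python
import random


def bias_gap(behavior, strength, seeded_reviews, n_consumers=2000, seed=0):
    """Return (displayed average) - (average private opinion).

    Illustrative assumptions, not taken from the working paper:
    - private opinions are uniform on {1, 2, 3, 4, 5};
    - a herding consumer reports a weighted mix of own opinion and the
      current average;
    - a compensating consumer pushes the report away from the current
      average, in the direction of own opinion;
    - reports are clipped to the 1-to-5 scale.
    """
    if behavior not in ("herding", "compensating"):
        raise ValueError("behavior must be 'herding' or 'compensating'")

    rng = random.Random(seed)
    total = float(sum(seeded_reviews))  # running sum of displayed reviews
    count = len(seeded_reviews)         # running number of displayed reviews
    true_total = 0.0                    # running sum of private opinions

    for _ in range(n_consumers):
        own = rng.choice([1, 2, 3, 4, 5])
        true_total += own

        if count == 0:
            report = float(own)         # nothing to react to yet
        else:
            avg = total / count
            if behavior == "herding":
                report = (1 - strength) * own + strength * avg
            else:
                report = own + strength * (own - avg)

        report = min(5.0, max(1.0, report))
        total += report
        count += 1

    return total / count - true_total / n_consumers


# Hypothetical manipulation: five planted 5-star reviews posted first.
fake = [5.0] * 5
herding_gap = bias_gap("herding", 0.8, fake)
compensating_gap = bias_gap("compensating", 0.5, fake)
print(f"herding gap:      {herding_gap:+.3f}")
print(f"compensating gap: {compensating_gap:+.3f}")
```

Under these assumptions, the herding run ends with a visibly larger gap than the compensating run, because each herding report partly repeats the planted average while each compensating report pushes against it. Changing `strength`, the number of planted reviews, or `n_consumers` shows how sensitive that outcome is to the assumed behavior.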

This research highlights the difficulties of determining whether the true value or quality of a product or service is being reflected in the marketplace. Reviews have become an increasingly important tool for consumers deciding on purchases, but their accuracy is hard to track. Reviews are often bought (written for hire) or influenced (perks, discounts, or refunds are offered in exchange for writing a positive review or agreeing not to post a negative review).

With practical applications for marketing managers and operations managers, this research highlights the need to better understand how existing reviews are synthesized by consumers and the need to devise mechanisms to limit biases and manipulation in reviews.

Omar Besbes is the Philip H. Geier Jr. Associate Professor of Business in the Decision, Risk, and Operations Division at Columbia Business School.
