Reverse Engineering the Yelp Review Filter Algorithm
Yelp Review Filter Has Significantly Damaged (and Helped?) Local Businesses
We have heard from a very large number of disgruntled local companies that have been negatively impacted by the Yelp review filter. The typical story is:
- We got a bad review on Yelp.
- We asked our clients to post (hopefully positive) reviews on Yelp.
- The good reviews get filtered and are not displayed.
- Web searchers only see the bad review(s).
- We are losing business.
In an open forum like Yelp or Google+, there is no validation of legitimacy. Would the bad reviews pass arbitration? How many reviews are actually legitimate? Who knows? It does not really matter: open review forums are here to stay. These forums hurt legitimate businesses when they are targeted by false, fake, or exaggerated reviews with a negative agenda. By the same token, the Yelp review filter has (significantly?) helped businesses when it works correctly by showing only legitimate reviews. In practice, however, the Yelp review filter is overly aggressive: it should not be filtering out so many legitimate reviews. Google+ also filters reviews, but its filter is nowhere near as aggressive. This aggressive filtering is, without a doubt, unfairly hurting businesses and giving unscrupulous business owners an easy way to degrade their competition.
The Yelp review filter is just another algorithm to make sense of the millions of pieces of information on its website. It is not human. It is just math. It relies on counting certain indicators or variables to make an assessment. Its “reference mode” can be deciphered and reverse engineered like any other model.
“Cracking” the Yelp Review Filter Algorithm
Having worked with algorithms for the past 13 years in operations research and vehicle route optimization, I have found that reverse engineering an algorithm and its behavior can seem mystifying, but it is not without patterns. After researching filtered and unfiltered reviews on Yelp and comparing them to my own previous reviews (none of which were filtered), I think I may be closer to understanding the Yelp review filtering algorithm. These inferences are not based on statistical research, but on gut feeling, experience, some quantification, and an article by a former Yelp employee on the Yelp review filter algorithm. There have also been a few recent research studies, but I did not read them. If you simply compare the filtered and unfiltered Yelp reviews, the following patterns emerge.
- The unfiltered reviews are often lengthy: at least 100 words and 500 characters including spaces.
- The unfiltered reviews often have historical context, such as a story or experience that mentions timelines, time frames, or date and month references.
- The reviewer often has at least a few reviews already completed. If you look at the filtered reviews, many (if not most) are from new users with few reviews; conversely, frequent posters get filtered less.
- The reviewer is typically an active Yelp user, not necessarily writing reviews, but logging in to Yelp and researching companies. This was evident in my own case: I have only written a few reviews, all of which were posted, but I often use the Yelp mobile app to look up businesses, mainly restaurants. According to the ex-employee article, this is the most important component of the review filter. The article explained that Yelp looks at users' IP addresses: if you never search on Yelp for businesses in the same geographic area (based on IP) as the business you are reviewing, your review could be filtered. It also seems likely that a person who logs in to their company page on Yelp and then logs in as a user from the same IP to post a review would get filtered. The article also claimed that if you had Liked a business on Facebook, your review would be filtered. I find that hard to believe; it would make no sense to disqualify a review for that reason, since it is common to Like a company on Facebook and also leave it a review.
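The signals in the list above can be sketched as a simple scoring function. This is purely illustrative: every threshold, weight, and cutoff below is a guess inferred from my observations, not anything Yelp has published, and the real filter is certainly far more sophisticated.

```python
import re

# All thresholds and weights are hypothetical; Yelp publishes none of these numbers.
MIN_WORDS = 100
MIN_CHARS = 500
# Rough "historical context" detector: month names, four-digit years, and
# common time-frame words ("May" is omitted to avoid matching the modal verb).
TIME_REF = re.compile(
    r"\b(january|february|march|april|june|july|august|september|"
    r"october|november|december|\d{4}|weeks?|months?|years?)\b",
    re.IGNORECASE,
)

def looks_legitimate(text, prior_review_count, searched_same_area):
    """Toy score of the four signals listed above; returns True if the
    review would (by this guess) survive the filter."""
    score = 0
    if len(text.split()) >= MIN_WORDS and len(text) >= MIN_CHARS:
        score += 1  # lengthy, detailed review
    if TIME_REF.search(text):
        score += 1  # mentions dates or time frames
    if prior_review_count >= 3:
        score += 1  # established reviewer, not a brand-new account
    if searched_same_area:
        score += 2  # active user near the business: reportedly the strongest signal
    return score >= 3  # arbitrary cutoff
```

A long, dated review from an active local user scores well; a one-line review from a brand-new account does not, which matches the filtering patterns described above.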
These algorithm rules do not appear to be applied in every case, nor do they need to be fully satisfied; for example, an active Yelp user can have a short one-sentence review that does not get filtered, and a new Yelp user can have a negative review posted. This "negative review preference" may be another algorithm guideline: negative reviews do not get filtered as much. Compare this with the research presented in this article, which found that 1- and 5-star reviews get filtered more. In my case, I have left one 1-star review and six 5-star reviews, none of which were filtered (but I do use Yelp fairly often). I have a hunch that leaving some lower-star reviews, rather than only 5-star reviews, may also help a reviewer avoid being filtered.
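The star-rating hunch above can also be sketched as a toy rule: an extreme rating (1 or 5 stars) from an account whose history contains only one star value looks more suspicious than the same rating from a reviewer with a varied history. Again, the labels and logic are my own guesses, not Yelp's.

```python
def rating_suspicion(reviewer_star_history, new_review_stars):
    """Toy signal: how suspicious a new rating looks, given the
    reviewer's past star values. Purely illustrative."""
    extreme = new_review_stars in (1, 5)
    # A mix of star values suggests a genuine reviewer rather than an
    # account created solely to praise (or attack) one business.
    varied_history = len(set(reviewer_star_history)) > 1
    if extreme and not varied_history:
        return "high"    # e.g. yet another 5-star review from an all-5-star account
    if extreme:
        return "medium"  # extreme rating, but from a varied reviewer
    return "low"         # moderate ratings draw little scrutiny
```

Under this guess, my own mix of one 1-star and six 5-star reviews would read as a varied history, which may be part of why none were filtered.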
Another aspect that I would assume to be a component of Yelp's and Google+'s filtering is the frequency of reviews: if a business gets a flood of reviews, they get filtered. I have noticed this on Google+. I advised an MMA gym owner to get more reviews from his students, since legitimate reviews significantly help local SEO and Google Maps rankings. Of course, the reviews flowed in after he mentioned it to his students. Google+ initially did not post them, but then unfiltered them a few months later. They were all legitimate reviews, just posted within a short time frame.
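The burst behavior I saw on Google+ (a flood of reviews held back, then released months later) resembles a simple rate limit. The sketch below holds any review that would push the count past a per-window cap; the window length and cap are invented numbers, since neither service publishes its thresholds.

```python
from datetime import datetime, timedelta

def flag_review_burst(review_times, window_days=30, max_per_window=5):
    """Toy burst detector: hold any review that arrives after more than
    `max_per_window` reviews have landed in the trailing `window_days`.
    Both thresholds are guesses; the real services publish no numbers."""
    review_times = sorted(review_times)
    window = timedelta(days=window_days)
    held = []
    for i, t in enumerate(review_times):
        # Count reviews (including this one) in the window ending at t.
        recent = [u for u in review_times[:i + 1] if t - u <= window]
        if len(recent) > max_per_window:
            held.append(t)  # would initially not be posted
    return held
```

Eight reviews in eight days would trip this rule while the same eight reviews spread over a year would not, which mirrors what happened at the gym: all legitimate, just too many at once.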
If you are going to spend the time to write a review on Yelp, you want that review to actually be posted; otherwise, you have wasted the effort of complimenting or chastising the company and contributing to the online community. I hope these observations, assessments, and "light" research findings shed some light on how the Yelp review filter appears to behave.