One of the big complaints about the confluence of social media and e-commerce is the potential for companies to game the system by padding business profiles on sites like Yelp with fake reviews -- unfounded positive reviews of their own businesses, and malicious critical reviews of competitors. Inevitably, this has given rise to a whole new branch of online security aimed at sniffing out the fake comments.
This week brought news that four students from Cornell University have created software to identify fake online hotel reviews (okay, I admit I have
an ulterior motive in writing about this: Cornell is in my hometown of Ithaca). The Cornell project brought together three computer science students and a communications major, who described their
efforts to combat "opinion spam" in a report titled "Finding Deceptive Opinion Spam by Any Stretch of the Imagination."
Overall, the Cornellians claim their software can spot fake reviews about 90% of the time, versus just 50% for human subjects. The work focuses on fake positive reviews and on irrelevant comments, such as those that post links to other Web sites for promotional purposes. With this focus in mind, the authors say roughly half of all online hotel reviews are fake -- about four times the proportion guessed by human subjects, who pegged fake reviews at 12% of the total.
Unsurprisingly, one of the key giveaways for fake (and real) reviews is word choice, and the project uncovered some interesting trends here. For example, you are
more likely to find words like "hotel," "my," "experience," "vacation," and the names of cities in fake reviews. Meanwhile, real reviews are more likely to contain words like "floor," "bathroom," "small," and the "$" sign. Of course, now that fake-review writers know this, they can populate their bogus comments with these "authentic" words. Just be on your guard if you see a review that says "Great floor! Loved the small bathroom! $!"
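If you're curious how word choice gets turned into a detector, here's a minimal sketch. To be clear, this is not the Cornell team's system: it's a toy bag-of-words classifier, and the sample reviews, labels, and library choice (scikit-learn) are all illustrative assumptions.

```python
# Toy illustration only -- not the Cornell system. Assumes a small labeled
# set of reviews (1 = fake, 0 = real); the examples below are made up.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_reviews = [
    "My vacation at this hotel was an amazing experience in Chicago!",  # fake-ish
    "The bathroom was small and the floor creaked; $220 a night.",      # real-ish
]
train_labels = [1, 0]  # hypothetical labels: 1 = fake, 0 = real

# Bag-of-words features: each review becomes a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_reviews)

# A naive Bayes classifier learns which words lean "fake" vs. "real".
classifier = MultinomialNB()
classifier.fit(X, train_labels)

# Score a new review.
new_review = ["Great floor! Loved the small bathroom! $!"]
prediction = classifier.predict(vectorizer.transform(new_review))
print("fake" if prediction[0] == 1 else "real")
```

With a real training set of thousands of labeled reviews, a classifier along these lines would pick up exactly the kinds of telltale words the Cornell study describes -- which is also why it stops working once spammers learn the vocabulary.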