In online reviews, patterns in vocabulary can betray deceit

September 12, 2011 at 5:02 pm

My room at the Hotel Monaco in Chicago was small, but not cramped. There’s a decent attached restaurant and a free evening wine hour in the ornate — yet cozy, thanks to the working fireplace — lobby. The hotel is a few blocks from the convention center, ideal for a reporter covering a scientific conference. I know, because I was there. But how do you know I was there?

Look closely at my review. Online reviews are littered with linguistic clues that separate legitimate reviews from the fakes, new research reveals. And we should thank the academics who are looking into it (I was not paid to review them favorably), because there’s a huge financial incentive and little cost for businesses to get into the fraudulent review game.

So on to the red flags. Certain words — not necessarily ones you’d expect — are signs of lies: luxury, husband, I, business. Others indicate truth: on, bathroom, near, small, but, yet.

Take a look at this review of the Monaco:

I stayed at the Monaco for the Labor day weekend when I visited my family in Chicago. It is one of the nicest hotels i have stayed at in my life. clean, comfortable and pretty. The rooms were clean and the staff is very caring. I needed some more glasses as my family was sharing drink. The front desk had them sent up in less than 10 minutes. I will recommend Monaco to anyone who will be staying in the Chicago area.

Lies, lies and more lies.

Don’t feel bad if you can’t see the distinction. When Cornell University researchers had human beings judge which of 800 reviews of Chicago area hotels were genuine, the humans didn’t do much better than chance. But three computer-based approaches all fared better than the humans. And a combo of two computational approaches sorted real from fake with nearly 90 percent accuracy, Myle Ott and colleagues Yejin Choi, Claire Cardie and Jeffrey Hancock report online July 22 at arXiv.org.

Ott and colleagues recruited 400 people to each write one fake (but realistic) review portraying the hotel in a positive light. The team also gathered 400 real reviews of Chicago hotels from TripAdvisor (and in case they accidentally collected fake reviews, the analysis was repeated with reviews from Hotels.com, which allows reviews only by people who have booked a hotel through the site).

Some phrases echo previous work identifying speech patterns that indicate deception or truth. But other findings contrast with the standard psychological view of deceptive speech, suggesting that there is no single deception signature — and that liars may switch things up depending on context.

Truthful reviews tended to have spatial language (floor, bathroom, small), prepositions (which usually concern space: beside, across, on) and nouns yielding concrete descriptions:

The room I had was on the 23rd floor and was like a suite, with a living area and a bedroom. The living room was spacious, with a plasma TV, a desk and a couch. The beds were very comfortable and the toiletries of very good quality. In the closet they placed an umbrella, which came in handy, it rained the whole time I was in Chicago.

It makes sense that truthful reviews have more spatial words and solid nouns, says Ott, because spatial details are difficult to fake. It’s hard to tell a good specific lie about a place you have never been. Deceptive reviews, on the other hand, contained more filler: words about things external to the hotel itself, such as business, vacation and husband.

My husband and I stayed at the Omni after attending a wedding that took place there. We were delighted at the luxury of the rooms and the accommodations were wonderful. Everyone from the concierge to the housekeepers were friendly and professional. We were extremely pleased with the whole experience and look forward to our next trip to Chicago so we can stay there again. And, by the way, the wedding was absolutely gorgeous!

The wedding? There’s a hint of deception. Extremely pleased? Absolutely gorgeous? Come on. Superlatives are another red flag. Deceptive writing often contains exaggerated language. And it rarely includes caveats — words that exclude, like but and yet.
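For readers who want to try the idea, here is a toy sketch in Python that tallies the cue words mentioned in this article. It is not the Cornell team's software, which this article does not describe in detail. The word lists come only from the examples quoted here, and the function names and the one-point-per-word scoring are assumptions made for illustration.

```python
import re

# Cue words taken from the examples in this article. Treating every
# occurrence as one point is an assumption, not the researchers' method.
DECEPTIVE_CUES = {"luxury", "husband", "i", "business", "vacation"}
TRUTHFUL_CUES = {"on", "bathroom", "near", "small", "but", "yet",
                 "floor", "beside", "across"}


def cue_counts(review):
    """Return (deceptive_hits, truthful_hits) for a review string."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", review.lower())
    deceptive = sum(1 for word in words if word in DECEPTIVE_CUES)
    truthful = sum(1 for word in words if word in TRUTHFUL_CUES)
    return deceptive, truthful


def lean(review):
    """Say which way the raw cue counts tilt (hypothetical helper)."""
    deceptive, truthful = cue_counts(review)
    if deceptive > truthful:
        return "leans deceptive"
    if truthful > deceptive:
        return "leans truthful"
    return "no lean"


print(lean("My husband and I loved the luxury"))
print(lean("The bathroom was small but near the lobby"))
```

A raw tally like this is easy to fool and says nothing certain about any single review. A lone "I" or "on" can tip the count either way, which is one reason a real filter would need far more evidence than a handful of words.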

This research isn’t just an interesting contribution to understanding the psychology of deception. It also might be a useful filter for review websites such as Yelp or TripAdvisor, says Ott.

Of course, once the word is out, spammers will probably switch up their game, says informatics expert Filippo Menczer of Indiana University in Bloomington. “It’s an arms race. The marks get smarter and the fraudsters do too,” says Menczer, whose recent research has focused on how spammers use social networking tools such as Twitter.

“Social media has lowered the price of committing fraud,” he says. “It was much more expensive in the days when you had to stand by the Colosseum all day and try and sell it to tourists.”

SN Prime | September 12, 2011 | Vol. 1, No. 13