
The old Latin proverb "Quis custodiet ipsos custodes?"

http://en.wikipedia.org/wiki/Quis_custodiet_ipsos_custodes%3...

might in this context be paraphrased as "Who is rating the raters?" The hope in any online rating system is that enough people will come forward to rate the thing you care about that the raters with crazy opinions end up as mere outliers among a majority who share your well-informed opinions. But how do you ever know that when you see an online rating of something you haven't personally experienced?

Amazon has had star ratings for a long time. I largely ignore them and read the reviews instead. For mathematics books (the thing I shop for most on Amazon), I look for reviewers who have read other good mathematics books and who compare the book I don't know to books I do know. If an undergraduate whines, "This book is really hard and does a poor job of explaining the subject," while a mathematics professor says, "This book is more rigorous than most other treatments of the subject," I am likely to conclude that the book is a good one, ESPECIALLY if I can find comments about it being a good treatment of the subject on websites that review several titles at once, for example websites that advise self-learners on how to study mathematics.

The problem with any commercial website with ratings (Amazon, Yelp, etc.) is that there is a HUGE incentive to game the ratings. Authors post bad ratings for books by other authors. The mother and sister and cousins of a restaurant owner post great ratings for their relative's restaurant and lousy ratings for competing restaurants. I usually have no idea what bias enters into an online rating. So I look at the written descriptions of the good or service being sold, and I look for signals that the rater isn't just making things up and really knows what the competing offerings are like. When I am shopping for something, I ask my friends (via Facebook, often enough) for their personal recommendations. Online ratings are hopelessly broken because there is no way to authenticate what the raters actually know, so minor tweaks to rating dimensions or data display are of little consequence for improving them.



> The problem with any commercial website with ratings (Amazon, Yelp, etc.) is that there is a HUGE incentive to game the ratings.

While I agree that this is a problem, I think a bigger problem is a simple matter of scale:

Amazon is huge, and many people buy things, but they don't split reviews and ratings by what kind of person is rating them. If they wanted to make an improvement, why not show me only ratings and reviews by people who are similar to me? They have tons of data about me and other people who use the service, so it should be possible for them to say "people like you rated this on average a 4, but everyone in the world rates it an average of 2.5."

That's much easier than having to read all the reviews and decide if the person is in my demographic or whether I agree with their review.
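A minimal sketch of what that could look like, assuming each rating is tagged with a coarse user-segment label (the segment names and data here are hypothetical):

    from collections import defaultdict

    # Hypothetical (user_segment, stars) pairs. In practice the segment
    # would come from purchase history, demographics, etc.
    ratings = [
        ("parents_of_toddlers", 4), ("parents_of_toddlers", 5),
        ("parents_of_toddlers", 4), ("general", 2), ("general", 3),
        ("general", 1), ("general", 2),
    ]

    def average(xs):
        return sum(xs) / len(xs)

    by_segment = defaultdict(list)
    for segment, stars in ratings:
        by_segment[segment].append(stars)

    my_segment = "parents_of_toddlers"
    print(f"People like you rated this {average(by_segment[my_segment]):.1f}")
    print(f"Everyone rated this {average([s for _, s in ratings]):.1f}")

The hard part, of course, is computing "people like you" in the first place; a real system would cluster users by behavior rather than rely on fixed segments.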


Like Netflix


In my experience, Netflix ratings are total garbage. IMDb and Rotten Tomatoes ratings are both far more accurate.


> "Who is rating the raters?"

Netflix does. By cross referencing your likes and dislikes against those of your fellow Netflix members, the company is able to create a meta rating system, in which the score you see for a movie is your own. You see that score because that's how much Netflix thinks you'll like it, based on how similar people liked it.

This is the only good way to go about it. The trick is that it's easy to do with movies, but much harder with product ratings and the like. Maybe this is an opportunity for someone to build something on top of Facebook or Amazon.
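A sketch of that idea with user-user similarity: weight each other member's rating by how closely their history matches yours, here with cosine similarity over co-rated movies (all data hypothetical):

    import math

    # Hypothetical user -> {movie: stars} rating histories.
    ratings = {
        "me":  {"Alien": 5, "Heat": 4, "Cars": 1},
        "ana": {"Alien": 5, "Heat": 5, "Cars": 1, "Solaris": 5},
        "bob": {"Alien": 1, "Heat": 2, "Cars": 5, "Solaris": 2},
    }

    def similarity(a, b):
        # Cosine similarity over the movies both users rated.
        shared = set(a) & set(b)
        if not shared:
            return 0.0
        dot = sum(a[m] * b[m] for m in shared)
        na = math.sqrt(sum(a[m] ** 2 for m in shared))
        nb = math.sqrt(sum(b[m] ** 2 for m in shared))
        return dot / (na * nb)

    def predict(user, movie):
        # Similarity-weighted average of other users' ratings.
        num = den = 0.0
        for other, hist in ratings.items():
            if other == user or movie not in hist:
                continue
            w = similarity(ratings[user], hist)
            num += w * hist[movie]
            den += w
        return num / den if den else None

    # ~4.0: pulled toward ana's 5 because her taste matches mine,
    # versus a plain average of 3.5.
    print(predict("me", "Solaris"))

Real systems mean-center each user's ratings (Pearson correlation) or use latent-factor models, since raw cosine treats any two positive ratings as agreement.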


Pandora does something similar. They only have "like" and "dislike" ratings. If you like a song, they look at other users who liked that song and try to find more songs/bands from the people who share your taste. And the other way around for dislikes, I guess.

It works exceptionally well. You just listen to the stream of incoming songs; you never pick songs yourself. After a good song you click like, after a mediocre song you keep listening, and during a bad song you click skip. After a few days you will only get good songs (with a few exceptions, of course). It's like magic; I can't even count how many new bands I've found through Pandora without any effort.

Too bad it doesn't work outside the US anymore. :(
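A sketch of the co-occurrence mechanism described above (whether or not this is literally what Pandora does internally): collect the other songs liked by users who liked a given song, and rank them by count. All data hypothetical:

    from collections import Counter

    # Hypothetical user -> set of liked songs.
    likes = {
        "u1": {"Song A", "Song B", "Song C"},
        "u2": {"Song A", "Song C", "Song D"},
        "u3": {"Song E"},
    }

    def also_liked(song):
        # Songs most often co-liked with `song`, ranked by count.
        counts = Counter()
        for liked in likes.values():
            if song in liked:
                counts.update(liked - {song})
        return counts.most_common()

    print(also_liked("Song A"))
    # [('Song C', 2), ('Song B', 1), ('Song D', 1)]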


Could the same rating calculation be used on Hacker News?


Is Netflix really the only site that takes this nearly-braindead machine-learning approach?

Once you realize that people have different tastes and you know someone's preferences, it's the obvious solution. Or is crunching that much statistical data so expensive that it can only be offered to paying subscribers?


The more accurate you want to get, the more computationally expensive it becomes. Netflix actually ran a contest with a million-dollar prize for the team that could come up with the most accurate rating-prediction algorithm. In the end, the million-dollar algorithm was too expensive to put into production, so they never ended up using it.
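For reference, the workhorse of most Netflix Prize entries was latent-factor matrix factorization trained by stochastic gradient descent (the winning entry was a huge ensemble of such models). A toy sketch, with hypothetical data and hyperparameters:

    import random

    random.seed(0)

    # Hypothetical (user, movie, stars) triples.
    data = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 5), (2, 2, 2)]
    n_users, n_movies, k = 3, 3, 2   # k latent factors per user/movie
    lr, reg = 0.05, 0.02             # learning rate, L2 regularization

    P = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_movies)]

    def predict(u, m):
        return sum(P[u][f] * Q[m][f] for f in range(k))

    for epoch in range(200):
        for u, m, r in data:
            err = r - predict(u, m)
            for f in range(k):
                pu, qm = P[u][f], Q[m][f]
                P[u][f] += lr * (err * qm - reg * pu)  # gradient step
                Q[m][f] += lr * (err * pu - reg * qm)

    # Predicted rating for a (user, movie) pair not in the training data.
    print(round(predict(0, 2), 2))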


rateyourmusic.com sort of has this, but it's not in the default view. You have to go to an album and click a "View my suggested rating" button, then it whirs for a few seconds before giving you an average of ratings from users with similar taste. It would be much more useful to browse the whole site with those ratings showing, but I get the feeling it's a computationally expensive feature.


Re: Your watchmen quote

I think Amazon's "Was this rating helpful (Yes/No)?" provides a good filter for ratings. A lot of mindlessly negative reviews get filtered out by the users who come along afterwards and rate the rating in their own self-interest.
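One standard way to rank by helpful votes without letting two early votes beat ninety later ones is the lower bound of the Wilson score interval. A sketch (I have no idea whether Amazon actually uses this):

    import math

    def wilson_lower_bound(helpful, total, z=1.96):
        # Lower bound of the 95% Wilson score interval for the
        # helpful-vote fraction; penalizes small sample sizes.
        if total == 0:
            return 0.0
        p = helpful / total
        denom = 1 + z * z / total
        center = p + z * z / (2 * total)
        margin = z * math.sqrt((p * (1 - p) + z * z / (4 * total)) / total)
        return (center - margin) / denom

    print(wilson_lower_bound(2, 2))     # 2 of 2 helpful   -> ~0.34
    print(wilson_lower_bound(90, 100))  # 90 of 100 helpful -> ~0.83

So a review with 90/100 helpful votes outranks one with a perfect 2/2 record.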


That can be easily gamed as well. If you want to boost a rating for a book, mark all of the negative comments as not helpful... In fact, I see that happen on Amazon and Newegg a lot.


They have a partial solution for that -- on Amazon (at least) you can filter for the most helpful unfavorable reviews.


Regarding the unhelpfulness of online reviews, my company has problems with manufacturers/sellers writing 5-star reviews of their own product listings (ASINs) on Amazon. We've begun (manually) data-mining 5-star reviews to identify whether each 5-star reviewer has any other reviews (or a wish list, to indicate the possibility of a real user account), then calculating the percentage of reviews written by no-history user accounts. Of the ASINs we've assessed, the gut-level-doesn't-seem-like-heavy-review-fraud listings can be in the 6% range, whereas the looks-like-review-fraud ASINs are above 20%. We're working with Amazon to identify and penalize these manufacturers/sellers, but internally at Amazon the Seller Performance team is separate from their Community (user review) team, so it presents a challenge. It's also hard for them to separate valid complaints from sour-grapes complaints.
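A sketch of that screening heuristic, assuming you've already scraped each 5-star reviewer's total review count (the ASINs and data are hypothetical; the 20% threshold comes from the numbers above):

    # Per ASIN: review-history sizes of its 5-star reviewers.
    # A count of 1 means this product is the account's only review.
    five_star_reviewer_history = {
        "B000AAAAA1": [12, 3, 47, 8, 5, 22, 9, 30, 4, 2, 15, 1, 6, 11, 7, 19],
        "B000BBBBB2": [1, 1, 2, 1, 1, 3, 1, 1, 14, 1],
    }

    def no_history_pct(history_sizes):
        # Share of 5-star reviewers whose account has no other reviews.
        flagged = sum(1 for n in history_sizes if n <= 1)
        return 100.0 * flagged / len(history_sizes)

    for asin, sizes in five_star_reviewer_history.items():
        pct = no_history_pct(sizes)
        flag = "suspect" if pct > 20 else "looks ok"
        print(f"{asin}: {pct:.0f}% no-history 5-star reviewers ({flag})")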


"but internally at Amazon the Seller Performance team is separate from their Community (user review)"

Indeed, my suspicion is that organizational politics have more to do with the lack of a better rating system than any technical limitation.

The approach I use is to read the 3-star ratings first, before biasing myself with the more extreme ratings. I also check what else the reviewer has rated; if there's nothing there, I immediately dismiss the review.


Relative ratings are more useful. Everyone uses their own absolute scale, but their rankings of one movie against another are consistent. I want to see how people rated a movie relative to other movies I've watched.

N people rated this better than movie X but worse than movie Y.

Ranking movies could be made easy: show five movie posters instead of five stars, or offer an auto-complete field for "this movie is up there with:".
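One standard way to turn "I liked X more than Y" judgments into a ranking is an Elo-style update, borrowed from chess ratings (a sketch; as far as I know, no site mentioned here actually does this):

    # Elo-style ratings from pairwise movie comparisons.
    K = 32  # update step size

    ratings = {}  # movie -> rating, default 1500

    def rate(winner, loser):
        rw = ratings.setdefault(winner, 1500.0)
        rl = ratings.setdefault(loser, 1500.0)
        expected_w = 1 / (1 + 10 ** ((rl - rw) / 400))  # win probability
        ratings[winner] = rw + K * (1 - expected_w)
        ratings[loser] = rl - K * (1 - expected_w)

    # Hypothetical judgments: each pair means "first beat second".
    for a, b in [("Heat", "Cars"), ("Alien", "Cars"), ("Heat", "Alien")]:
        rate(a, b)

    for movie, r in sorted(ratings.items(), key=lambda kv: -kv[1]):
        print(f"{movie}: {r:.0f}")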


Anyone interested in relative ratings should look at Dan Ariely's[1] Predictably Irrational. He is an economist who writes about behavioral economics and decision making.

I've often thought about some start-up ideas around relative ratings, and this book was the reason.

[1] - http://danariely.com/


In the context of a website like Netflix, where your recommendations and the ratings you see are based on your history of ratings, aren't you the one rating the raters?

However, the type of rating mentioned in the OP and the type of rating on Netflix only seem to work in specific niches. I can't imagine how a website like Amazon would implement anything even close to what Goodfilms is doing.


Many reviewers are biased and judgemental. I prefer to look at the distribution of ratings that Amazon shows. Good items have a distribution with a single peak, even if it isn't at 5 stars. When the ratings form a saddle, the product is probably a fluke.
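A sketch of that single-peak test on a 1-to-5-star histogram (toy numbers): call the distribution a saddle if some interior bucket sits strictly below buckets on both sides of it.

    def is_saddle(histogram):
        # True if the star histogram dips in the middle: some interior
        # bucket is strictly below a bucket on each side of it.
        for i in range(1, len(histogram) - 1):
            dip = histogram[i]
            if dip < max(histogram[:i]) and dip < max(histogram[i + 1:]):
                return True
        return False

    print(is_saddle([2, 5, 20, 40, 33]))   # False: one peak near 4 stars
    print(is_saddle([25, 10, 4, 12, 49]))  # True: piles at 1 and 5 stars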


The cute XKCD comic aside, the distribution is also useful because, for certain types of things, a lot of fairly-to-very negative comments are illuminating even if the average rating is still pretty high. It's not just about polarizing material. If you look at genre fiction, for example, you'll get a lot of fans who give 5s no matter what sort of crap the current book is. But if there is also a notable number of 1s and 2s, that's often a red flag.


IMO the perfect solution would be to rate only by saying "I like" or "I dislike," and then writing a review (to show how much you liked/disliked the product/movie, etc.).

When you first get on the website, you're asked to rate a number of products. The more you rate, the more accurate the solution becomes.

Once it can match your taste with users who have the same taste, you then see those users' ratings prioritized for new products. You can even ask them why they liked/disliked something if they didn't write a review, because their opinion matters to you now.
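A sketch of matching users by taste from like/dislike votes alone: score each pair of users by the fraction of co-voted items on which they agree (all data hypothetical):

    # Hypothetical user -> {item: True for like, False for dislike}.
    votes = {
        "me":  {"a": True, "b": False, "c": True},
        "pat": {"a": True, "b": False, "d": True},
        "sam": {"a": False, "c": False, "d": True},
    }

    def taste_similarity(u, v):
        # Fraction of co-voted items where the two users agree.
        shared = set(votes[u]) & set(votes[v])
        if not shared:
            return 0.0
        agree = sum(1 for i in shared if votes[u][i] == votes[v][i])
        return agree / len(shared)

    for other in ("pat", "sam"):
        print(other, taste_similarity("me", other))  # pat 1.0, sam 0.0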


I'd just like to point out that bias feeds into written reviews as well. I may think good service means no one refilling my water every 5 seconds, whereas you might think it means your food getting out in under 10 minutes.

While words do express the point better, this rating system is a step in the right direction.



