Question
Wilson's confidence interval takes binary inputs, TRUE or FALSE, i.e. "upvotes" and "downvotes" respectively, and generates a rating from those votes.
For the purposes of my project, I think WCI is perfect. However, a binary upvote/downvote is not enough to describe the thing I am rating.
That's where a 5-star rating comes in, and this is where I need someone to disprove my logic. I'm thinking that if I were to implement a 5-star rating with WCI, the following should work without hacking the internals of the confidence interval.
For each star in the rating widget we assign a unique integer value. Each value counts as either positive (upvotes) or negative (downvotes). So the values would be:
1/5 stars: -2
2/5 stars: -1
3/5 stars: +1
4/5 stars: +2
5/5 stars: +3
To summarise: the minimum vote of 1 star counts as 2 downvotes, and a vote of 2 stars as 1 downvote. The middle vote of 3 stars gives 1 upvote, 4 stars gives 2 upvotes, and the maximum of 5 stars gives 3 upvotes.
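In code, the proposed mapping might look like this (a minimal sketch; the table and the tally name are just illustrations, not part of any library):

```python
# Proposed mapping: negative values count as downvotes, positive as upvotes.
STAR_TO_VOTES = {1: -2, 2: -1, 3: 1, 4: 2, 5: 3}

def tally(ratings):
    """Turn a list of 1-5 star ratings into an (upvotes, downvotes) pair."""
    up = down = 0
    for r in ratings:
        v = STAR_TO_VOTES[r]
        if v > 0:
            up += v
        else:
            down -= v  # v is negative, so this adds its magnitude
    return up, down

print(tally([5, 3, 1, 2]))  # -> (4, 3): 3+1 upvotes, 2+1 downvotes
```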
Please disprove this logic: why won't this work? Maybe it goes against the "average person's understanding" of a star rating system?
Answer 1:
It's easy to think of the following "workaround", which converts a multi-level rating system to the binary upvote/downvote-style ranking (which can then be scored using the lower bound of the Wilson score confidence interval):
Let's say you have the popular 5-star rating system, so we have a number of votes, each with a value of 1, 2, 3, 4 or 5.
To "convert" these ratings to up/down votes, use the following rule:
For each star rating, add:
*     - 0.00 to upvotes and 1.00 to downvotes (i.e. a full downvote)
**    - 0.25 to upvotes and 0.75 to downvotes
***   - 0.50 to upvotes and 0.50 to downvotes
****  - 0.75 to upvotes and 0.25 to downvotes
***** - 1.00 to upvotes and 0.00 to downvotes (i.e. a full upvote)
After reducing the 5-star ratings to up/down votes, we can proceed with the usual score calculation described in Evan Miller's article.
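Not part of the original answer, but a minimal Python sketch of this conversion, using the Wilson lower-bound formula from Evan Miller's article (the function and constant names are mine; z = 1.96 is assumed for ~95% confidence):

```python
import math

# Fractional up/down contribution per star value, per the table above.
STAR_TO_UP = {1: 0.00, 2: 0.25, 3: 0.50, 4: 0.75, 5: 1.00}

def wilson_lower_bound(up, down, z=1.96):
    """Lower bound of the Wilson score interval (z = 1.96 ~ 95% confidence)."""
    n = up + down
    if n == 0:
        return 0.0
    phat = up / n
    return (phat + z * z / (2 * n)
            - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)

def score_from_stars(ratings):
    """Map 1-5 star ratings to fractional up/down votes, then score them."""
    up = sum(STAR_TO_UP[r] for r in ratings)
    down = len(ratings) - up  # each rating contributes exactly 1.0 vote in total
    return wilson_lower_bound(up, down)

print(score_from_stars([5, 4, 4, 3, 1]))  # mixed ratings -> a conservative score
```

Note that up and down become fractional here, but the Wilson formula only needs the totals, so nothing else changes.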
I am not a statistician or mathematician, so I would love to hear from other people whether this makes sense and what the issues with this approach might be.
Answer 2:
First, try to understand the intuition behind WCI, or, even simpler, behind the normal approximation interval ( http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval ).
The intuition behind all of these interval calculations is simple: you compute a sample mean and its standard deviation, and the interval is mean ± z*std.
In your case, calculating the mean is simple: it is just the mean of the ratings themselves. Let p1 be the fraction of 1-star ratings, and similarly p2, ..., p5, so that p1 + p2 + ... + p5 = 1, and assume you computed these statistics from n samples. The mean of your data is 1*p1 + 2*p2 + ... + 5*p5.
The variance of the sample mean is ( E(x^2) - (E(x))^2 )/n = ( (p1*1^2 + p2*2^2 + ... + p5*5^2) - (1*p1 + 2*p2 + ... + 5*p5)^2 )/n.
Since std = sqrt(var), it is pretty straightforward to calculate Normal approximation interval. I will let you work on extending this to WCI.
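As a worked illustration of the above (my own sketch with made-up vote counts, not code from the answer):

```python
import math

# counts[k] = number of votes with k stars (hypothetical data).
counts = {1: 10, 2: 5, 3: 20, 4: 30, 5: 35}
n = sum(counts.values())

# Sample mean: sum of star values weighted by their fraction of votes.
mean = sum(star * c for star, c in counts.items()) / n

# E(x^2), then variance of the sample mean = (E(x^2) - mean^2) / n.
ex2 = sum(star * star * c for star, c in counts.items()) / n
var_mean = (ex2 - mean * mean) / n

z = 1.96  # ~95% confidence
half_width = z * math.sqrt(var_mean)
print(f"{mean:.2f} +/- {half_width:.2f}")  # e.g. 3.75 +/- 0.25
```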
Answer 3:
The biggest problem with this scheme is that a single 5-star rating (3 upvotes) carries as much weight as three 2-star ratings (3 downvotes combined). Worse, an item with 300 3-star ratings (which should be a mediocre score) will have the same score as an item with 100 5-star ratings (which should be a perfect score), since both translate to exactly 300 upvotes.
What you could do instead is calculate a Wilson confidence interval for each possible score; the lower bound of each interval then serves as the weight of that score in a weighted average.
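One way to read this suggestion, sketched in Python (the interpretation is mine: for each star level, the votes for that level count as successes and all other votes as failures; wilson_lower_bound is the same formula as in the sketch under Answer 1):

```python
import math

def wilson_lower_bound(up, down, z=1.96):
    """Lower bound of the Wilson score interval."""
    n = up + down
    if n == 0:
        return 0.0
    phat = up / n
    return (phat + z * z / (2 * n)
            - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)

def weighted_star_score(counts):
    """counts[k] = number of k-star votes; returns a weighted 1-5 average."""
    n = sum(counts.values())
    # Per-score weight: Wilson lower bound of "this score's votes vs the rest".
    weights = {star: wilson_lower_bound(c, n - c) for star, c in counts.items()}
    total = sum(weights.values())
    return sum(star * w for star, w in weights.items()) / total if total else 0.0

print(weighted_star_score({1: 10, 2: 5, 3: 20, 4: 30, 5: 35}))  # ~3.9
```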
Source: https://stackoverflow.com/questions/19613023/wilsons-confidence-interval-for-5-star-rating