Question
Wilson's confidence interval takes binary inputs, TRUE or FALSE, i.e. "upvotes" and "downvotes" respectively, and generates a rating from those votes.
For the purposes of my project, I think WCI is perfect. However, a binary upvote/downvote is not enough to describe the thing I am rating.
That's where a 5-star rating comes in, and this is where I need someone to disprove my logic. I'm thinking that if I were to implement a 5-star rating with WCI, the following should work without hacking the internals of the confidence interval.
For each star in the rating widget we assign a unique integer value. Each value counts as either positive (upvotes) or negative (downvotes). So the values would be:
1/5 stars: -2
2/5 stars: -1
3/5 stars: +1
4/5 stars: +2
5/5 stars: +3
To summarise: the minimum vote of 1 star counts as 2 downvotes, and a vote of 2 stars as 1 downvote. The middle vote of 3 stars gives 1 upvote, 4 stars gives 2 upvotes, and the maximum of 5 stars gives 3 upvotes.
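In code, the proposed mapping might look like this (a minimal sketch; the table and the tally name are just illustrations, not part of any library):

```python
# Proposed mapping: negative values count as downvotes, positive as upvotes.
STAR_TO_VOTES = {1: -2, 2: -1, 3: 1, 4: 2, 5: 3}

def tally(ratings):
    """Turn a list of 1-5 star ratings into an (upvotes, downvotes) pair."""
    up = down = 0
    for r in ratings:
        v = STAR_TO_VOTES[r]
        if v > 0:
            up += v
        else:
            down -= v  # v is negative, so this adds its magnitude
    return up, down

print(tally([5, 3, 1, 2]))  # -> (4, 3): 3+1 upvotes, 2+1 downvotes
```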
Please disprove this logic: why won't this work? Maybe it goes against the "average person's understanding" of a star rating system?
Answer 1:
It's easy to think of the following "workaround", which converts a multi-level rating system to the binary upvote/downvote-style ranking (which can then be scored using the lower bound of the Wilson score confidence interval):
Let's say you have the popular 5-star rating system, so we have a number of votes, each with a value of 1, 2, 3, 4 or 5.
To "convert" these ratings to up/down votes, use the following rule:
For each star rating, add:
*     - 0.00 to upvotes and 1.00 to downvotes (i.e. a full downvote)
**    - 0.25 to upvotes and 0.75 to downvotes
***   - 0.50 to upvotes and 0.50 to downvotes
****  - 0.75 to upvotes and 0.25 to downvotes
***** - 1.00 to upvotes and 0.00 to downvotes (i.e. a full upvote)
After reducing the 5-star ratings to up/down votes, we can proceed with the usual score calculation described in Evan Miller's article.
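Not part of the original answer, but a minimal Python sketch of this conversion, using the Wilson lower-bound formula from Evan Miller's article (the function and constant names are mine; z = 1.96 is assumed for ~95% confidence):

```python
import math

# Fractional up/down contribution per star value, per the table above.
STAR_TO_UP = {1: 0.00, 2: 0.25, 3: 0.50, 4: 0.75, 5: 1.00}

def wilson_lower_bound(up, down, z=1.96):
    """Lower bound of the Wilson score interval (z = 1.96 ~ 95% confidence)."""
    n = up + down
    if n == 0:
        return 0.0
    phat = up / n
    return (phat + z * z / (2 * n)
            - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)

def score_from_stars(ratings):
    """Map 1-5 star ratings to fractional up/down votes, then score them."""
    up = sum(STAR_TO_UP[r] for r in ratings)
    down = len(ratings) - up  # each rating contributes exactly 1.0 vote in total
    return wilson_lower_bound(up, down)

print(score_from_stars([5, 4, 4, 3, 1]))  # mixed ratings -> a conservative score
```

Note that up and down become fractional here, but the Wilson formula only needs the totals, so nothing else changes.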
I am not a statistician or mathematician, so I would love to hear from other people whether this makes sense and what the issues with this approach might be.
Answer 2:
First, try to understand the intuition behind WCI, or, even simpler, behind the normal approximation interval ( http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval ).
The intuition behind all of these interval calculations is simple: you compute a sample mean and its standard deviation, and the interval is mean ± z*std.
In your case, calculating the mean is simple: it is just the mean of the ratings themselves. Let p1 be the fraction of 1-star ratings, and similarly p2, ..., p5, so that p1 + p2 + ... + p5 = 1, and assume you computed these statistics from n samples. The mean of your data is 1*p1 + 2*p2 + ... + 5*p5.
The variance of the sample mean is ( E(x^2) - (E(x))^2 )/n = ( (p1*1^2 + p2*2^2 + ... + p5*5^2) - (1*p1 + 2*p2 + ... + 5*p5)^2 )/n.
Since std = sqrt(var), it is pretty straightforward to calculate Normal approximation interval. I will let you work on extending this to WCI.
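As a worked illustration of the above (my own sketch with made-up vote counts, not code from the answer):

```python
import math

# counts[k] = number of votes with k stars (hypothetical data).
counts = {1: 10, 2: 5, 3: 20, 4: 30, 5: 35}
n = sum(counts.values())

# Sample mean: sum of star values weighted by their fraction of votes.
mean = sum(star * c for star, c in counts.items()) / n

# E(x^2), then variance of the sample mean = (E(x^2) - mean^2) / n.
ex2 = sum(star * star * c for star, c in counts.items()) / n
var_mean = (ex2 - mean * mean) / n

z = 1.96  # ~95% confidence
half_width = z * math.sqrt(var_mean)
print(f"{mean:.2f} +/- {half_width:.2f}")  # e.g. 3.75 +/- 0.25
```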
Answer 3:
The biggest problem with this scheme is that a single 5-star rating (3 upvotes) carries as much weight as three 2-star ratings (3 downvotes combined). Worse, an item with 300 3-star ratings (which should be a mediocre score) will have the same score as an item with 100 5-star ratings (which should be a perfect score), since both translate to exactly 300 upvotes.
What you could do instead is calculate a Wilson confidence interval for each possible score; the lower bound of each interval then serves as the weight of that score in a weighted average.
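One way to read this suggestion, sketched in Python (the interpretation is mine: for each star level, the votes for that level count as successes and all other votes as failures; wilson_lower_bound is the same formula as in the sketch under Answer 1):

```python
import math

def wilson_lower_bound(up, down, z=1.96):
    """Lower bound of the Wilson score interval."""
    n = up + down
    if n == 0:
        return 0.0
    phat = up / n
    return (phat + z * z / (2 * n)
            - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)

def weighted_star_score(counts):
    """counts[k] = number of k-star votes; returns a weighted 1-5 average."""
    n = sum(counts.values())
    # Per-score weight: Wilson lower bound of "this score's votes vs the rest".
    weights = {star: wilson_lower_bound(c, n - c) for star, c in counts.items()}
    total = sum(weights.values())
    return sum(star * w for star, w in weights.items()) / total if total else 0.0

print(weighted_star_score({1: 10, 2: 5, 3: 20, 4: 30, 5: 35}))  # ~3.9
```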
Source: https://stackoverflow.com/questions/19613023/wilsons-confidence-interval-for-5-star-rating