Question
For student council this year, I'm on the "songs" committee; we pick the songs. Unfortunately, the kids at the dances always end up hating some of the stupid song choices. I thought I could make it different this year. Last Thursday, I created a simple PHP application so kids could submit songs into the database, supplying a song name, artist, and genre (from a drop-down). I also implemented a voting feature similar to Reddit's. Click an upvote button and you've upvoted the song, incrementing the upvote count. Same with downvotes.
Anywho, in the database, I have three tidbits of information I thought I could use to rate these songs: upvotes, downvotes, and a timestamp. For a while, the rank was created by simply having the songs with the higher "vote" count at the top. That is, songs with more upvotes and fewer downvotes (upvotes - downvotes) would be at the top of the list. That worked for a while, but there were about 75 songs on the list by Sunday, and the songs that were submitted first were simply at the top of the list.
Sunday, I changed the rank algorithm to (upvotes - downvotes) / (CurrentTimestamp - CreationTimestamp), that is, the higher the vote count in the lesser amount of time, the higher the song would be on the list. This works better, but still not how I'd like it.
What happens now is that the instant a song is created and upvoted to a vote count of 1, it ends up at the top of the list somewhere. Songs that have vote counts in the negatives aren't viewed often because kids don't usually scroll to the bottom.
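To see why a brand-new song jumps to the top, here is a minimal Ruby sketch of the formula from the question. The hash keys (:up, :down, :created_at), the helper name velocity_rank, and the sample numbers are assumptions made up for illustration; the real app is PHP and its schema isn't shown.

```ruby
# Rank from the question: (upvotes - downvotes) / (now - created_at).
# Timestamps are plain Unix seconds; the field names are hypothetical.
def velocity_rank(song, now)
  age = now - song[:created_at]
  return 0.0 if age <= 0 # guard against dividing by zero for a just-created song
  (song[:up] - song[:down]).to_f / age
end

now = 1_000_000
songs = [
  { name: "Old favorite", up: 60, down: 10, created_at: now - 3 * 24 * 3600 },
  { name: "Brand new",    up: 1,  down: 0,  created_at: now - 60 }
]

# Highest rank first.
songs.sort_by { |s| -velocity_rank(s, now) }.each do |s|
  puts format("%-12s %.6f", s[:name], velocity_rank(s, now))
end
```

With these made-up numbers, the one-minute-old song with a single upvote scores 1/60, while the three-day-old song with a net of 50 votes scores 50/259200, so the new song ranks far above it. The denominator is tiny right after submission, which is what produces the effect described above.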
I guess I could sort the data so the lower songs appear at the top, so people are forced to see the lower songs. Honestly, I've never had to work on a "popularity" algorithm before, so, what are your thoughts?
Website's at http://www.songs.taphappysoftware.com - I don't know if I should put this here or not, might cause some unwanted songs at the dance :0
Answer 1:
That's a very good question. There are a few similar questions that have been asked here.
This article is probably a good place to start. Apparently upvotes minus downvotes is a bad way to do it. The better way is to use complicated maths to assign a score to each and sort by that.
Here is a scoring function in Ruby from the article:
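require 'statistics2'

# Lower bound of the Wilson score confidence interval for the fraction of positive ratings.
def ci_lower_bound(pos, n, power)
  # No ratings yet, so there is nothing to estimate.
  if n == 0
    return 0
  end
  z = Statistics2.pnormaldist(1-power/2)
  phat = 1.0*pos/n
  (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
end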
pos is the number of positive ratings, n is the total number of ratings, and power refers to the statistical power: pick 0.10 to have a 95% chance that your lower bound is correct, 0.05 to have a 97.5% chance, etc.
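If you'd rather not depend on the statistics2 gem, the following is a rough sketch of the same formula with z hard-coded. The value 1.644854 is the normal quantile for power = 0.10; the function name wilson_lower_bound, the hash keys, and the sample vote counts are assumptions for illustration only, not anything from the article.

```ruby
# Same formula as ci_lower_bound above, but z is passed in directly.
# The default 1.644854 is the 0.95 normal quantile, i.e. power = 0.10.
def wilson_lower_bound(pos, n, z = 1.644854)
  return 0.0 if n == 0
  phat = pos.to_f / n
  (phat + z * z / (2 * n) -
    z * Math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)
end

# Hypothetical songs: upvotes and downvotes straight from the database.
songs = [
  { name: "One lucky upvote", up: 1,  down: 0 },
  { name: "Crowd favorite",   up: 40, down: 10 },
  { name: "Mostly disliked",  up: 2,  down: 8 }
]

# Sort by score, highest first; n is the total number of votes.
ranked = songs.sort_by { |s| -wilson_lower_bound(s[:up], s[:up] + s[:down]) }
ranked.each { |s| puts s[:name] }
```

With these sample numbers, the song with 40 upvotes out of 50 lands above the song with a single upvote, because one vote is too little evidence to pull the lower bound up very far. That is the behavior the question is after: a fresh song with a vote count of 1 no longer shoots straight to the top.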
As a usability thing, I would sort the data by the score, but I would not show the score to the user. I would only show the number of upvotes and downvotes.
Answer 2:
How about sorting songs by posting time or number of votes (negative + positive)? If your goal is to give every song equal attention, this sounds good enough.
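As a rough sketch, both orderings are one-liners. The hash keys and helper names here (by_newest, by_total_votes) are made up for illustration, assuming each song carries a creation timestamp and its two vote counts.

```ruby
# Newest submissions first, using a Unix timestamp in :created_at (assumed field).
def by_newest(songs)
  songs.sort_by { |s| -s[:created_at] }
end

# Most-voted first, counting upvotes and downvotes together.
def by_total_votes(songs)
  songs.sort_by { |s| -(s[:up] + s[:down]) }
end

songs = [
  { name: "Early hit",     up: 30, down: 5,  created_at: 100 },
  { name: "Controversial", up: 20, down: 25, created_at: 200 },
  { name: "Just added",    up: 0,  down: 0,  created_at: 300 }
]

puts by_newest(songs).map { |s| s[:name] }.join(", ")
puts by_total_votes(songs).map { |s| s[:name] }.join(", ")
```

Sorting by total votes treats a downvote as attention just like an upvote, so a heavily downvoted song still surfaces near the top.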
Source: https://stackoverflow.com/questions/3705537/sorting-a-list-of-songs-by-popularity