I've been looking at ranking algorithms recently, specifically those used by Reddit and Hacker News. The algorithms themselves are simple enough, but I don't quite understand
Reddit uses Pyrex, so the sort algorithm is compiled as a Python C extension to improve performance.
You can do the same in SQL whenever the record is updated, e.g. when a post is upvoted or downvoted.
Here is the pseudocode, which you must translate to your SQL engine's syntax:
function hot(ups, downs, date){
    score = ups - downs;
    order = log(max(abs(score), 1), 10);
    if (score>0){
        sign = 1;
    } else {
        if (score<0){
            sign = -1;
        } else {
            sign = 0;
        }
    }
    td = date - datetime(1970,1,1);
    seconds = td.days * 86400 + td.seconds + (float(td.microseconds) / 1000000) - 1134028003;
    return round(order + sign * seconds / 45000, 7);
}
So you must store the ups, downs, date and the result of the hot function in the post table. You can then sort on the hot column.
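As a rough illustration of the idea above, here is a minimal Python sketch that mirrors the pseudocode term by term and keeps the result in a stored column. It uses an in-memory SQLite database only for convenience; the `posts` table, its column names, and the `vote` helper are hypothetical names chosen for this example, not anything from Reddit's code, and dates are assumed to be stored as UTC Unix timestamps.

```python
import sqlite3
from datetime import datetime, timezone
from math import log10

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def hot(ups, downs, date):
    """Same arithmetic as the pseudocode; `date` must be a timezone-aware datetime."""
    score = ups - downs
    order = log10(max(abs(score), 1))
    sign = (score > 0) - (score < 0)  # 1, -1 or 0
    seconds = (date - EPOCH).total_seconds() - 1134028003
    return round(order + sign * seconds / 45000, 7)

def vote(conn, post_id, up):
    """Hypothetical helper: record one vote and refresh the stored hot value."""
    column = "ups" if up else "downs"  # fixed whitelist, never user input
    with conn:  # one transaction for the vote and the recomputation
        conn.execute(
            f"UPDATE posts SET {column} = {column} + 1 WHERE id = ?", (post_id,)
        )
        ups, downs, created = conn.execute(
            "SELECT ups, downs, created FROM posts WHERE id = ?", (post_id,)
        ).fetchone()
        date = datetime.fromtimestamp(created, tz=timezone.utc)
        conn.execute(
            "UPDATE posts SET hot = ? WHERE id = ?", (hot(ups, downs, date), post_id)
        )

# Assumed schema: `created` holds a UTC Unix timestamp, `hot` the cached score.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, "
    "ups INTEGER NOT NULL DEFAULT 0, "
    "downs INTEGER NOT NULL DEFAULT 0, "
    "created REAL NOT NULL, "
    "hot REAL NOT NULL DEFAULT 0)"
)
created = datetime(2010, 1, 1, tzinfo=timezone.utc).timestamp()
conn.executemany(
    "INSERT INTO posts (id, created) VALUES (?, ?)", [(1, created), (2, created)]
)

for _ in range(10):
    vote(conn, 1, up=True)  # post 1 ends with 10 upvotes
vote(conn, 2, up=True)      # post 2 ends with 1 upvote

# Sorting now only reads the stored column.
ranked = [row[0] for row in conn.execute("SELECT id FROM posts ORDER BY hot DESC")]
print(ranked)  # [1, 2]
```

Because the value only changes when a vote arrives, the read path is a plain ORDER BY on one column, which can be indexed in most engines.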
You can see the Reddit source code here: http://code.reddit.com/
I implemented an SQL version of Reddit's ranking algorithm for a video aggregator like so:
SELECT id, title
FROM videos
ORDER BY
    LOG10(ABS(cached_votes_total) + 1) * SIGN(cached_votes_total)
    + (UNIX_TIMESTAMP(created_at) / 300000) DESC
LIMIT 50
cached_votes_total is updated by a trigger whenever a new vote is cast. It runs fast enough on our current site, but I am planning on adding a ranking value column and updating it with the same trigger as the cached_votes_total column. After that optimization, it should be fast enough for most any size site.
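To make the planned ranking column concrete, here is a small Python sketch of the value such a trigger would store. It mirrors the ORDER BY expression above; the function name `ranking_value` and the sample rows are made up for illustration, and the timestamp is assumed to be a Unix timestamp like the one UNIX_TIMESTAMP(created_at) returns.

```python
from math import log10

def ranking_value(votes_total, created_at_unix):
    """Mirror of: LOG10(ABS(v) + 1) * SIGN(v) + UNIX_TIMESTAMP(created_at) / 300000."""
    sign = (votes_total > 0) - (votes_total < 0)  # 1, -1 or 0, like SQL SIGN()
    return log10(abs(votes_total) + 1) * sign + created_at_unix / 300000

# Hypothetical rows: (id, cached_votes_total, created_at as a Unix timestamp).
t = 1_500_000_000
videos = [("a", 9, t), ("b", 99, t), ("c", -9, t)]

# Equivalent of ORDER BY ... DESC LIMIT 50, computed on the client side.
top = [
    vid
    for vid, votes, created in sorted(
        videos, key=lambda v: ranking_value(v[1], v[2]), reverse=True
    )
][:50]
print(top)  # ['b', 'a', 'c']
```

One consequence of the 300000 divisor: the time term grows by 1 every 300000 seconds (about 3.5 days), while the vote term grows by 1 each time the vote count plus one is multiplied by ten. In this sketch a video with 99 votes therefore ties with one that has 9 votes and was posted 300000 seconds later.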
edit: More information at Reddit Hotness Algorithm in SQL