How to calculate mean based on number of votes/scores/samples/etc?

Submitted by 心已入冬 on 2019-12-04 18:23:44

When ranking results, you could use a weighted score instead of just displaying the average vote so far, by multiplying the average by some function of the number of votes.

An example in C# (because that's what I happen to know best...) that could easily be translated into your language of choice:

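double avgScore = Math.Round((double)sum / n);  // the cast guards against integer division
double rank = avgScore * Math.Log(n);           // Math.Log(n) is the natural (base e) logarithm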

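Here I've used the logarithm of n as the weighting function, but it will only work well if the number of votes is neither too small nor too large. With a single vote the weight is log(1) = 0, so the rank is zero whatever the score, and for very large n the logarithm grows so slowly that additional votes barely change the rank. Exactly how large is "optimal" depends on how much you want the number of votes to matter.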
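
As a rough sketch of how this plays out, the snippet below wraps the formula in a hypothetical helper named Rank (the name, the n < 1 guard, and leaving the average unrounded are my assumptions, not part of the answer) and compares a high average from few votes with a slightly lower average from many votes. It is written with C# 9 top-level statements.

```csharp
using System;

// Hypothetical helper: the average score weighted by the natural log of the vote count.
// The n < 1 guard is an assumption; the formula itself leaves that case open.
double Rank(double sum, int n)
{
    if (n < 1) return 0.0;          // no votes yet: rank at the bottom
    double avgScore = sum / n;      // sum is a double, so this is not integer division
    return avgScore * Math.Log(n);  // Math.Log(1) is 0, so a single vote also ranks 0
}

// A 5.0 average from 2 votes versus a 4.0 average from 100 votes.
Console.WriteLine(Rank(10, 2) < Rank(400, 100));  // True: the larger sample wins
```

If a single vote should count for something, a weight such as Math.Log(n + 1) would be one possible variation.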

If you like the logarithmic approach, but the natural logarithm (base e, which is what Math.Log(n) computes) doesn't really work with your vote counts, you could easily use another base. For example, to do it in base 3 instead:

double rank = avgScore * Math.Log(n, 3);
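
One thing that may be worth checking before settling on a base: since log_b(n) = ln(n) / ln(b), switching bases multiplies every weight by the same constant. The sketch below uses made-up vote counts to print both weights side by side. The ratio stays at ln(3), roughly 1.099, so the base changes the magnitude of the rank but, on its own, not the order of items sorted by it.

```csharp
using System;

// Weights for the same vote counts under base e and base 3 (illustrative counts).
foreach (int n in new[] { 5, 50, 500 })
{
    double natural = Math.Log(n);
    double base3 = Math.Log(n, 3);
    // The ratio equals Math.Log(3) for every n, so sorting by either weight
    // produces the same order.
    Console.WriteLine($"n={n}: ln={natural:F3}, log3={base3:F3}, ratio={natural / base3:F3}");
}
```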

Which function you should use for weighting is probably best decided by the order of magnitude of the number of votes you expect to reach.

You could also use a custom weighting function by defining

double rank = avgScore * w(n);

where w(n) returns the weight value depending on the number of votes. You then define w(n) as you wish, for example like this:

double w(int n) {
    // caution! ugly example code ahead...
    // if you even want this approach, at least use a switch... :P

    if (n > 100) { 
        return 10; 
    } else if (n > 50) {
        return 8;
    } else if (n > 40) {
        return 6;
    } else if (n > 20) {
        return 3;
    } else if (n > 10) {
        return 2;
    } else {
        return 1;
    }
}
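
For what it's worth, here is one way the same thresholds might look with the switch the comment asks for, as a sketch using C# 9 relational patterns. The name Weight is hypothetical and chosen only to avoid clashing with w(n) above.

```csharp
using System;

// Same thresholds as w(n), written as a C# 9 switch expression.
// Arms are tested top to bottom, so the first matching threshold wins.
double Weight(int n) => n switch
{
    > 100 => 10,
    > 50 => 8,
    > 40 => 6,
    > 20 => 3,
    > 10 => 2,
    _ => 1,
};

Console.WriteLine(Weight(45));  // 6
```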

If you want to use the idea from my other answer that you referenced (thanks!), namely a pessimistic lower bound on the average, then I think some additional assumptions or parameters will need to be introduced.

To make sure I understand: With 10000 votes, every single one of which is "2", you're very sure the true average is 2. With 2 votes, each a "2", you're very unsure -- maybe some 0's and 1's will come in and bring down the average. How to quantify that is, I think, your question.

Here's an idea: Everyone starts with some "baggage": a single phantom vote of "1". The person with 2 true "2" votes will then have an average of (1+2+2)/3 = 1.67, whereas the person with 10000 true "2" votes will have an average of 20001/10001 = 1.9999. That alone may satisfy your criteria. Or, to add the pessimistic lower bound idea and taking it as the mean minus one standard error, the person with 2 votes would have a pessimistic average score of 1.333 and the person with 10k votes would be about 1.9998.

(To be absolutely sure you'll never have the problem of zero standard error, use two different phantom votes. Or perhaps use as many phantom votes as there are possible vote values, one vote with each value.)
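
A minimal sketch of the phantom-vote idea, under stated assumptions: the helper name Score is made up, and the "pessimistic" score is taken to be the mean minus one standard error (sample standard deviation divided by the square root of n). Other lower bounds are possible. The phantom votes are passed as an array so that two or more of them can be tried as well.

```csharp
using System;
using System.Linq;

// Hypothetical helper: adds the phantom votes to the real ones, then returns the
// mean and the mean minus one standard error (an assumed form of the lower bound).
(double Mean, double LowerBound) Score(double[] votes, double[] phantomVotes)
{
    double[] all = phantomVotes.Concat(votes).ToArray();
    int n = all.Length;
    if (n < 2) throw new ArgumentException("Need at least two votes in total.");
    double mean = all.Average();
    double sumSq = all.Sum(v => (v - mean) * (v - mean));
    double stdErr = Math.Sqrt(sumSq / (n - 1)) / Math.Sqrt(n);
    return (mean, mean - stdErr);
}

double[] baggage = { 1.0 };  // a single phantom vote of "1"
var few = Score(new[] { 2.0, 2.0 }, baggage);
var many = Score(Enumerable.Repeat(2.0, 10000).ToArray(), baggage);
Console.WriteLine(FormattableString.Invariant($"{few.Mean:F4} {few.LowerBound:F4}"));    // 1.6667 1.3333
Console.WriteLine(FormattableString.Invariant($"{many.Mean:F4} {many.LowerBound:F4}"));  // 1.9999 1.9998
```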
