I have a Python script that creates a list of lists of server uptime and performance data, where each sub-list (or 'row') contains a particular cluster's stats. For example,
You need to calculate the mean (average) and standard deviation for the column. Standard deviation is a bit confusing, but the important fact is that, for roughly normally distributed data, about 2/3 of the values (68%) lie within
Mean +/- StandardDeviation
Generally, anything outside Mean +/- 2 * StandardDeviation is treated as an outlier (for normally distributed data that is roughly the outermost 5%), but you can tweak the multiplier.
http://en.wikipedia.org/wiki/Standard_deviation
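As a rough sketch of the rule above, using only the standard library: the row layout, the cluster names, and the helper column_bounds are all made up for illustration, since the question's real data isn't shown. It also assumes you want the population standard deviation (statistics.pstdev); use statistics.stdev if you prefer the sample version.

```python
import statistics

# Hypothetical rows: [cluster_name, uptime_percent]. The real layout is an assumption.
rows = [
    ["alpha", 99.0],
    ["beta", 98.0],
    ["gamma", 99.5],
    ["delta", 98.5],
    ["epsilon", 99.2],
    ["zeta", 98.8],
    ["eta", 80.0],
]

def column_bounds(rows, col, multiplier=2.0):
    """Return (low, high) = mean -/+ multiplier * population standard deviation."""
    values = [row[col] for row in rows]
    mean = statistics.mean(values)
    stddev = statistics.pstdev(values)
    return mean - multiplier * stddev, mean + multiplier * stddev

low, high = column_bounds(rows, col=1)

# Keep only the rows whose value falls outside the band.
outliers = [row for row in rows if not (low <= row[1] <= high)]
print(outliers)
```

With this sample data only the "eta" row falls outside the band. Raising the multiplier makes the check stricter about what counts as an outlier.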
So, to be clear, you want to express each value as the number of standard deviations it lies from the mean,
i.e.
def getdeviations(x, mean, stddev):
    # Distance of x from the mean, measured in standard deviations.
    return abs(x - mean) / stddev
NumPy has functions for this (numpy.mean and numpy.std).
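Something like the following should work with NumPy, applied to a whole column at once. The sample values and the threshold of 2.0 are assumptions for illustration. Note that numpy.std computes the population standard deviation by default (ddof=0); pass ddof=1 if you want the sample version.

```python
import numpy as np

# Hypothetical column of values, e.g. one stat pulled out of every row.
values = np.array([99.0, 98.0, 99.5, 98.5, 99.2, 98.8, 80.0])

mean = values.mean()
stddev = values.std()  # population standard deviation (ddof=0)

# Same calculation as getdeviations, vectorized over the whole column.
deviations = np.abs(values - mean) / stddev

# Boolean mask: True where a value is more than 2 standard deviations out.
is_outlier = deviations > 2.0
print(values[is_outlier])
```

If every value in the column is identical, stddev is 0 and the division is meaningless, so you may want to check for that first.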