SQL - STDEVP or STDEV and how to use it?

灰色年华 2020-12-30 19:37

I have a table:

LocationId OriginalValue Mean
1          0.45         3.99  
2          0.33         3.99
3          16.74        3.99
4          3.31                


        
3 Answers
  • 2020-12-30 20:17

    Use STDEV when your rows are only a sample and you want to estimate the standard deviation of the larger population they were drawn from. Use STDEVP when the column you pass in contains the entire population.

    Note that for large samples the two functions return nearly the same value, so STDEV is a safe default in that case.

  • 2020-12-30 20:23

    To use it, simply:

    SELECT STDEVP(OriginalValue)
    FROM yourTable
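
    To see how far apart the two functions are on a small data set, a minimal sketch can run both side by side. It assumes SQL Server (T-SQL) and uses a hypothetical inline `VALUES` list holding the four OriginalValue figures from the question instead of a real table.

    ```sql
    -- Hypothetical inline data: the four OriginalValue figures from the question.
    -- Swap the derived table for your own table name in practice.
    SELECT
        COUNT(*)              AS N,
        STDEV(OriginalValue)  AS SampleStdDev,      -- denominator is N - 1
        STDEVP(OriginalValue) AS PopulationStdDev   -- denominator is N
    FROM (VALUES (0.45), (0.33), (16.74), (3.31)) AS t(OriginalValue);
    ```

    With only four rows the sample figure comes out noticeably larger than the population figure, because the same sum of squared deviations is divided by 3 instead of 4.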
    

    Based on the explanation quoted below, you probably want STDEVP.

    From here:

    STDEV is used when the group of numbers being evaluated is only a partial sampling of the whole population. The denominator for dividing the sum of squared deviations is N-1, where N is the number of observations (a count of items in the data set). Technically, subtracting the 1 is referred to as "non-biased."

    STDEVP is used when the group of numbers being evaluated is complete, that is, the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Technically, this is referred to as "biased." Remembering that the P in STDEVP stands for "population" may be helpful. Since the data set is not a mere sample but consists of ALL the actual values, this function returns the exact standard deviation of that data set rather than an estimate.
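
    Written as formulas, the two definitions quoted above differ only in the denominator. The notation is an assumption added for illustration and does not come from the quoted source: $x_i$ are the $N$ values, $\bar{x}$ is their mean, $s$ stands for the STDEV result and $\sigma$ for the STDEVP result.

    ```latex
    \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
    s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^2}, \qquad
    \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^2}
    ```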

  • 2020-12-30 20:31

    In statistics there are two types of standard deviation: one for a sample and one for a population. The sample standard deviation, usually denoted by the letter s, is used as an estimate of the population standard deviation. The population standard deviation, usually denoted by the lowercase Greek letter sigma, is used when the data constitutes the complete population.

    It is difficult to answer your question directly (sample or population?) because it is difficult to tell which of the two you are working with. It often depends on context. Consider the following example. If I want to know the standard deviation of the age of students in my class, then I would use STDEVP because the class is my population. But if I want to use my class as a sample of the population of all students in the school (this would be what is known as a convenience sample, and would likely be biased, but I digress), then I would use STDEV because my class is a sample. The resulting value would be my best estimate of the population's STDEVP.

    As mentioned above, (1) for large sample sizes (say, more than thirty), the difference between the two becomes trivial, and (2) generally you should use STDEV, not STDEVP, because in practice we usually don't have access to the population. Indeed, one could argue that if we always had access to populations, then we wouldn't need statistics. The entire point of inferential statistics is to be able to make inferences about a population based on a sample.
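
    To put a rough number on "trivial": assuming the usual definitions (STDEV divides the sum of squared deviations by N-1, STDEVP divides it by N), the two results computed on the same data differ only by a factor that depends on N. Here $s$ and $\sigma$ are assumed shorthand for the STDEV and STDEVP results.

    ```latex
    \frac{\sigma}{s} = \sqrt{\frac{N-1}{N}}, \qquad
    N = 4:\ \sqrt{3/4} \approx 0.866, \qquad
    N = 30:\ \sqrt{29/30} \approx 0.983
    ```

    So with four rows, as in the question's table, the two functions differ by about 13%, while at thirty rows the gap is already under 2%.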
