Why do the diagonal elements of numpy cov and the var function have different values?

天涯浪人 2020-12-31 10:46
In [127]: x = np.arange(2)

In [128]: np.cov(x,x)
Out[128]:
array([[ 0.5,  0.5],
       [ 0.5,  0.5]])

In [129]: x.var()
Out[129]: 0.25

Why is this?

1 Answer
  • 2020-12-31 11:38

    In numpy, cov defaults to a "delta degrees of freedom" (ddof) of 1, while var defaults to a ddof of 0. From the notes to numpy.var:

    Notes
    -----
    The variance is the average of the squared deviations from the mean,
    i.e.,  ``var = mean(abs(x - x.mean())**2)``.
    
    The mean is normally calculated as ``x.sum() / N``, where ``N = len(x)``.
    If, however, `ddof` is specified, the divisor ``N - ddof`` is used
    instead.  In standard statistical practice, ``ddof=1`` provides an
    unbiased estimator of the variance of a hypothetical infinite population.
    ``ddof=0`` provides a maximum likelihood estimate of the variance for
    normally distributed variables.
    

    So you can get them to agree by passing the same ddof to both:

    In [69]: np.cov(x, x)  # defaulting to ddof=1
    Out[69]: 
    array([[ 0.5,  0.5],
           [ 0.5,  0.5]])
    
    In [70]: x.var(ddof=1)
    Out[70]: 0.5
    
    In [71]: np.cov(x, x, ddof=0)
    Out[71]: 
    array([[ 0.25,  0.25],
           [ 0.25,  0.25]])
    
    In [72]: x.var()  # defaulting to ddof=0
    Out[72]: 0.25
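
The same check can be sketched for more than one variable. The array `data` below is made-up sample data (not from the question), with one variable per row as np.cov assumes by default; the diagonal of the covariance matrix should match the per-row variance once both use the same ddof.

```python
import numpy as np

# Made-up sample data: each row is one variable, each column one observation
data = np.array([[2.0, 4.0, 4.0, 5.0],
                 [1.0, 3.0, 8.0, 0.0]])

cov_sample = np.cov(data)               # np.cov default: divisor N - 1
cov_population = np.cov(data, ddof=0)   # divisor N, like the var default

# Compare the diagonal against per-row variances computed with the same ddof
diag_matches_sample = np.allclose(np.diag(cov_sample), data.var(axis=1, ddof=1))
diag_matches_population = np.allclose(np.diag(cov_population), data.var(axis=1))
print(diag_matches_sample, diag_matches_population)
```

Comparing with np.allclose rather than == avoids spurious mismatches from floating-point rounding.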
    