numpy covariance matrix

半阙折子戏 2021-01-01 13:10

Suppose I have two vectors of length 25, and I want to compute their covariance matrix. I try doing this with numpy.cov, but always end up with a 2x2 matrix.



        
10 answers
  • 2021-01-01 13:47

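    Try this. Transposing the stacked array makes np.cov treat each of the 25 positions as a variable, so the result is 25x25:

    import numpy as np
    x = np.random.normal(size=25)
    y = np.random.normal(size=25)
    z = np.vstack((x, y))  # shape (2, 25)
    c = np.cov(z.T)        # z.T has shape (25, 2), so c is 25x25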
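
To see where the 2x2 and the 25x25 shapes come from, here is a minimal sketch that compares the two calls. The seeded generator and the names `c_pair` and `c_full` are only illustrative choices, not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.normal(size=25)
y = rng.normal(size=25)

# Two variables with 25 observations each: np.cov returns a 2x2 matrix.
c_pair = np.cov(x, y)

# vstack gives shape (2, 25); its transpose has shape (25, 2), which np.cov
# reads as 25 variables with 2 observations each, hence a 25x25 matrix.
z = np.vstack((x, y))
c_full = np.cov(z.T)

print(c_pair.shape)  # (2, 2)
print(c_full.shape)  # (25, 25)
```

By default np.cov treats each row of a 2-D input as a variable and each column as an observation, so the transpose is what changes the shape of the result.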
    
  • 2021-01-01 13:47

    As pointed out above, you only have two vectors, so you'll only get a 2x2 covariance matrix.

    IIRC the two main diagonal terms will be sum((x - mean(x))**2) / (n - 1), and similarly for y.

    The two off-diagonal terms will be sum((x - mean(x)) * (y - mean(y))) / (n - 1), with n = 25 in this case.
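
Those formulas can be checked against np.cov directly. This is a rough sketch rather than a reference implementation; the seed and the names `var_x`, `var_y`, `cov_xy` and `manual` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.normal(size=25)
y = rng.normal(size=25)
n = len(x)

# Diagonal terms: sample variances with the (n - 1) denominator.
var_x = np.sum((x - x.mean()) ** 2) / (n - 1)
var_y = np.sum((y - y.mean()) ** 2) / (n - 1)

# Off-diagonal term: sample covariance of x and y.
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

manual = np.array([[var_x, cov_xy],
                   [cov_xy, var_y]])

print(np.allclose(manual, np.cov(x, y)))  # True
```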

  • 2021-01-01 13:48

    The 2x2 matrix you got is more useful than a 25x25 one. The covariance of X and Y is the off-diagonal entry of that symmetric covariance matrix.

    If you insist on a 25x25 matrix, which I think is useless, then why don't you write out the definition yourself?

    import numpy as np
    x = np.random.normal(size=25).reshape(25, 1)  # make it a 2-D column vector
    y = np.random.normal(size=25).reshape(25, 1)

    # outer product of the centered vectors, divided by n (not n - 1)
    cov = np.matmul(x - np.mean(x), (y - np.mean(y)).T) / len(x)
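
One way to relate that 25x25 outer product back to the usual scalar covariance: its diagonal entries are the per-sample products, so its trace should equal the covariance of x and y computed with the n denominator. A small sketch, assuming a seeded generator and the illustrative names `outer` and `biased_cov`:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.normal(size=25).reshape(25, 1)
y = rng.normal(size=25).reshape(25, 1)

# Outer product of the centered column vectors, divided by n.
outer = np.matmul(x - np.mean(x), (y - np.mean(y)).T) / len(x)
print(outer.shape)  # (25, 25)

# bias=True makes np.cov divide by n instead of n - 1.
biased_cov = np.cov(x.ravel(), y.ravel(), bias=True)[0, 1]
print(np.isclose(np.trace(outer), biased_cov))  # True
```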
    
  • 2021-01-01 13:52

    According to the documentation, you might expect each variable to be a column:

    If we examine N-dimensional samples, X = [x1, x2, ..., xn]^T
    

    though later it says that each row is a variable:

    Each row of m represents a variable.
    

    So to get a 25x25 result you need to pass your matrix transposed:

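    import numpy as np
    x = np.random.normal(size=25)
    y = np.random.normal(size=25)
    X = np.array([x, y])  # shape (2, 25): each row is one variable
    np.cov(X.T)           # shape (25, 25): each column of X is now a variable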
    

    And according to Wikipedia: https://en.wikipedia.org/wiki/Covariance_matrix

    X is a column vector of random variables
    X = [X1,X2, ..., Xn]^T
    COV = E[X * X^T] - μx * μx^T   // μx = E[X]
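
The identity above can be checked numerically by replacing the expectations with sample averages. This is only a sketch: the 1000x3 sample array and the names `second_moment` and `cov_from_moments` are assumptions made for the example, and bias=True is used so that np.cov divides by n like the averages do.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed, chosen only for reproducibility
samples = rng.normal(size=(1000, 3))  # 1000 observations of a 3-dimensional X

mu = samples.mean(axis=0)                           # estimate of E[X]
second_moment = samples.T @ samples / len(samples)  # estimate of E[X X^T]
cov_from_moments = second_moment - np.outer(mu, mu)

# rowvar=False: columns are variables; bias=True: divide by n, not n - 1.
reference = np.cov(samples, rowvar=False, bias=True)
print(np.allclose(cov_from_moments, reference))  # True
```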
    

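    You can implement it yourself:

    # each row of X is an observation, each column is a variable
    X = X - X.mean(axis=0)   # center each column
    h, w = X.shape
    COV = X.T @ X / (h - 1)  # w x w matrix, same as np.cov(X, rowvar=False)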
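
As a sanity check, the hand-written version can be compared with np.cov. A minimal sketch, assuming the data is laid out as 25 observations (rows) of 2 variables (columns); the names `Xc` and `manual` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
X = rng.normal(size=(25, 2))    # 25 observations (rows) of 2 variables (columns)

Xc = X - X.mean(axis=0)         # subtract each column's mean
h, w = Xc.shape
manual = Xc.T @ Xc / (h - 1)    # (w, w) sample covariance matrix

print(manual.shape)                                  # (2, 2)
print(np.allclose(manual, np.cov(X, rowvar=False)))  # True
print(np.allclose(manual, np.cov(X.T)))              # True
```

Passing rowvar=False tells np.cov that the columns are the variables, which has the same effect as transposing the input.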
    