Remove mean from numpy matrix

渐次进展 2021-02-07 02:47

I have a NumPy matrix A where the data is organised column-vector-wise, i.e., A[:,0] is the first data vector, A[:,1] is the second, and so on.

4 answers
  • 2021-02-07 03:01

    You can also use np.matrix instead of np.array. The mean of a matrix stays two-dimensional, so you won't need to reshape:

    >>> A = np.matrix([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
    >>> m = A.mean(axis=1)
    >>> A - m
    matrix([[-1.,  0.,  1.],
            [-1.,  0.,  1.],
            [-1.,  0.,  1.],
            [-1.,  0.,  1.]])
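
    The reason no reshape is needed can be seen by comparing shapes. The snippet below is only an illustrative sketch (the variable names are made up for this example): a matrix mean keeps its second dimension, while an ndarray mean collapses to 1-D and no longer broadcasts against the 4 x 3 data. Note that the NumPy documentation no longer recommends np.matrix for new code, so this is mainly useful for understanding the behaviour.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

m_matrix = np.matrix(rows).mean(axis=1)  # matrix results stay 2-D
m_array = np.array(rows).mean(axis=1)    # ndarray result collapses to 1-D

print(m_matrix.shape)  # (4, 1)
print(m_array.shape)   # (4,)

# A (4, 3) array minus a (4,) array cannot be broadcast, so this raises.
broadcast_error = None
try:
    np.array(rows) - m_array
except ValueError as exc:
    broadcast_error = str(exc)
    print("plain array subtraction fails:", exc)
```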
    
  • 2021-02-07 03:11

    As is typical, you can do this a number of ways. Each of the approaches below works by adding a dimension to the mean vector, making it a 4 x 1 array, and then NumPy's broadcasting takes care of the rest. Each approach creates a view of mean, rather than a deep copy. The first approach (i.e., using newaxis) is likely preferred by most, but the other methods are included for the record.

    In addition to the approaches below, see also ovgolovin's answer, which uses a NumPy matrix to avoid the need to reshape mean altogether.

    For the methods below, we start with the following code and example array A.

    import numpy as np
    
    A = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
    mean = A.mean(axis=1)
    

    Using numpy.newaxis

    >>> A - mean[:, np.newaxis]
    array([[-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.]])
    

    Using None

    The documentation states that None can be used instead of newaxis. This is because

    >>> np.newaxis is None
    True
    

    Therefore, the following accomplishes the task.

    >>> A - mean[:, None]
    array([[-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.]])
    

    That said, newaxis is clearer and should be preferred. Also, a case can be made that newaxis is more future proof. See also: Numpy: Should I use newaxis or None?

    Using ndarray.reshape

    >>> A - mean.reshape((mean.shape[0], 1))
    array([[-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.]])
    

    Changing ndarray.shape directly

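    You can alternatively change the shape of mean directly. Unlike the approaches above, this modifies mean in place, and the NumPy documentation discourages setting shape this way in favor of reshape.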

    >>> mean.shape = (mean.shape[0], 1)
    >>> A - mean
    array([[-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.],
           [-1.,  0.,  1.]])
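
    As a quick check of the claim that these approaches produce views rather than copies, one option is np.shares_memory. This is a sketch, assuming the same example array as above; the candidates dictionary is just a name chosen for this illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mean = A.mean(axis=1)

# Three ways of turning the 1-D mean into a 4 x 1 column.
candidates = {
    "newaxis": mean[:, np.newaxis],
    "None": mean[:, None],
    "reshape": mean.reshape(-1, 1),
}

for name, col in candidates.items():
    # shares_memory is True when the column is a view onto mean's buffer.
    print(name, col.shape, np.shares_memory(col, mean))
```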
    
  • 2021-02-07 03:11

    Some of these answers look pretty old. I just tested this on NumPy 1.13.3:

    >>> import numpy as np
    >>> a = np.array([[1,1,3],[1,0,4],[1,2,2]])
    >>> a
    array([[1, 1, 3],
           [1, 0, 4],
           [1, 2, 2]])
    >>> a = a - a.mean(axis=0)
    >>> a
    array([[ 0.,  0.,  0.],
           [ 0., -1.,  1.],
           [ 0.,  1., -1.]])
    

    I think this is much cleaner and simpler. Give it a try and let me know if this is somehow inferior to the other answers.
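
    One thing worth checking before choosing between the answers is which axis is being centered. The sketch below (variable names are illustrative) suggests that axis=0 needs no reshape because the resulting shape (3,) lines up with the last axis of a, whereas the axis=1 mean used in the other answers has to become a column first. The two results are not the same quantity.

```python
import numpy as np

a = np.array([[1, 1, 3], [1, 0, 4], [1, 2, 2]])

# Mean of each column, shape (3,): broadcasts against a directly.
col_centered = a - a.mean(axis=0)

# Mean of each row, shape (3,): needs a new axis to become a (3, 1) column.
row_centered = a - a.mean(axis=1)[:, np.newaxis]

print(col_centered.mean(axis=0))  # column means after centering
print(row_centered.mean(axis=1))  # row means after centering
```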

  • 2021-02-07 03:21

    Yes. pylab.demean:

    In [1]: X = scipy.rand(2,3)
    
    In [2]: X.mean(axis=1)
    Out[2]: array([ 0.42654669,  0.65216704])
    
    In [3]: Y = pylab.demean(X, axis=1)
    
    In [4]: Y.mean(axis=1)
    Out[4]: array([  1.85037171e-17,   0.00000000e+00])
    

    Source:

    In [5]: pylab.demean??
    Type:           function
    Base Class:     <type 'function'>
    String Form:    <function demean at 0x38492a8>
    Namespace:      Interactive
    File:           /usr/lib/pymodules/python2.7/matplotlib/mlab.py
    Definition:     pylab.demean(x, axis=0)
    Source:
    def demean(x, axis=0):
        "Return x minus its mean along the specified axis"
        x = np.asarray(x)
        if axis == 0 or axis is None or x.ndim <= 1:
            return x - x.mean(axis)
        ind = [slice(None)] * x.ndim
        ind[axis] = np.newaxis
        return x - x.mean(axis)[ind]
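
    If pylab.demean is not available in your Matplotlib installation, the same idea can be sketched with NumPy alone. This is not Matplotlib's code: demean_sketch is a hypothetical name, and it uses keepdims=True in place of the slice-list indexing shown in the source above.

```python
import numpy as np

def demean_sketch(x, axis=0):
    """Return x minus its mean along the given axis (hypothetical helper)."""
    x = np.asarray(x)
    if axis is None:
        # Subtract the mean over all elements.
        return x - x.mean()
    # keepdims=True leaves the reduced axis with length 1,
    # so the mean broadcasts against x for any axis.
    return x - x.mean(axis=axis, keepdims=True)

X = np.arange(6, dtype=float).reshape(2, 3)
Y = demean_sketch(X, axis=1)
print(Y)
print(Y.mean(axis=1))
```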
    