How to add an extra column to a NumPy array

一个人的身影 2020-11-22 14:37

Let’s say I have a NumPy array, a:

a = np.array([
    [1, 2, 3],
    [2, 3, 4]
    ])

And I would like to add a column of zeros.

17 answers
  • 2020-11-22 15:27

    I find the following most elegant:

    b = np.insert(a, 3, values=0, axis=1) # Insert values before column 3
    

    An advantage of insert is that it also lets you insert columns (or rows) at other positions inside the array. And instead of a single value, you can just as easily insert a whole vector, for instance to duplicate the last column:

    b = np.insert(a, insert_index, values=a[:,2], axis=1) # insert_index: column to insert before, e.g. 3
    

    Which leads to:

    array([[1, 2, 3, 3],
           [2, 3, 4, 4]])
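
    As a minimal sketch of the positions np.insert accepts, the snippet below reuses the small array a from the question. The names b_end, b_front, and b_copy are only illustrative and do not appear in the answer.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [2, 3, 4]])

# Append a column of zeros: index a.shape[1] means "after the last column".
b_end = np.insert(a, a.shape[1], values=0, axis=1)

# Insert a column of zeros in front of the first column.
b_front = np.insert(a, 0, values=0, axis=1)

# Insert a copy of the last column before column 1.
b_copy = np.insert(a, 1, values=a[:, 2], axis=1)

print(b_end)    # [[1 2 3 0], [2 3 4 0]]
print(b_front)  # [[0 1 2 3], [0 2 3 4]]
print(b_copy)   # [[1 3 2 3], [2 4 3 4]]
```

    In each case np.insert returns a new array and leaves a unchanged.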
    

    As for timing, insert is slower than JoshAdel's solution in this small test:

    In [1]: N = 10
    
    In [2]: a = np.random.rand(N,N)
    
    In [3]: %timeit b = np.hstack((a, np.zeros((a.shape[0], 1))))
    100000 loops, best of 3: 7.5 µs per loop
    
    In [4]: %timeit b = np.zeros((a.shape[0], a.shape[1]+1)); b[:,:-1] = a
    100000 loops, best of 3: 2.17 µs per loop
    
    In [5]: %timeit b = np.insert(a, 3, values=0, axis=1)
    100000 loops, best of 3: 10.2 µs per loop
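
    For readers without IPython, the comparison can be approximated with the standard timeit module. This is only a sketch: the wrapper function names are made up here, and insert is called with index a.shape[1] (rather than 3) so that all three variants build the same array. Absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

N = 10
a = np.random.rand(N, N)


def with_hstack(a):
    return np.hstack((a, np.zeros((a.shape[0], 1))))


def with_prealloc(a):
    b = np.zeros((a.shape[0], a.shape[1] + 1))
    b[:, :-1] = a
    return b


def with_insert(a):
    return np.insert(a, a.shape[1], values=0, axis=1)


# All three approaches should produce identical arrays.
assert np.array_equal(with_hstack(a), with_prealloc(a))
assert np.array_equal(with_hstack(a), with_insert(a))

calls = 10_000
for fn in (with_hstack, with_prealloc, with_insert):
    # Best of 3 repeats, reported per call in microseconds.
    best = min(timeit.repeat(lambda: fn(a), number=calls, repeat=3))
    print(f"{fn.__name__}: {best / calls * 1e6:.2f} us per call")
```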
    
  • 2020-11-22 15:29

    I think a more straightforward solution, and a faster one to boot, is the following:

    import numpy as np
    N = 10
    a = np.random.rand(N,N)
    b = np.zeros((N,N+1))
    b[:,:-1] = a
    

    And timings:

    In [23]: N = 10
    
    In [24]: a = np.random.rand(N,N)
    
    In [25]: %timeit b = np.hstack((a,np.zeros((a.shape[0],1))))
    10000 loops, best of 3: 19.6 us per loop
    
    In [27]: %timeit b = np.zeros((a.shape[0],a.shape[1]+1)); b[:,:-1] = a
    100000 loops, best of 3: 5.62 us per loop
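
    The preallocate-and-copy idea can be wrapped in a small helper. The function name append_zero_column is hypothetical, and passing dtype=a.dtype is an addition to the answer's code so that integer input stays integer (np.zeros defaults to float64).

```python
import numpy as np


def append_zero_column(a):
    """Return a copy of 2-D array `a` with one extra column of zeros on the right."""
    rows, cols = a.shape
    # Allocate the final shape once, then copy the original data into place.
    b = np.zeros((rows, cols + 1), dtype=a.dtype)
    b[:, :-1] = a
    return b


a = np.array([[1, 2, 3],
              [2, 3, 4]])
b = append_zero_column(a)
print(b)        # [[1 2 3 0], [2 3 4 0]]
print(b.dtype == a.dtype)
```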
    
  • 2020-11-22 15:32

    np.concatenate also works. Note that the result below is upcast to float, because np.zeros creates a float array by default:

    >>> a = np.array([[1,2,3],[2,3,4]])
    >>> a
    array([[1, 2, 3],
           [2, 3, 4]])
    >>> z = np.zeros((2,1))
    >>> z
    array([[ 0.],
           [ 0.]])
    >>> np.concatenate((a, z), axis=1)
    array([[ 1.,  2.,  3.,  0.],
           [ 2.,  3.,  4.,  0.]])
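
    If the float result is not wanted, one option is to create the zeros with the same dtype as a. The sketch below contrasts the two cases; the names z_float, z_same, and c are illustrative only.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [2, 3, 4]])

# np.zeros defaults to float64, so concatenating upcasts the integer array.
z_float = np.zeros((a.shape[0], 1))
upcast = np.concatenate((a, z_float), axis=1)
print(upcast.dtype)  # float64

# Matching the dtype keeps the result in a's integer dtype.
z_same = np.zeros((a.shape[0], 1), dtype=a.dtype)
c = np.concatenate((a, z_same), axis=1)
print(c)             # [[1 2 3 0], [2 3 4 0]]
```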
    
  • 2020-11-22 15:33

    I like JoshAdel's answer because of its focus on performance. A minor improvement is to avoid the overhead of initializing the whole array with zeros, most of which are immediately overwritten. The difference is measurable when N is large: use empty instead of zeros, and write the column of zeros as a separate step:

    In [1]: import numpy as np
    
    In [2]: N = 10000
    
    In [3]: a = np.ones((N,N))
    
    In [4]: %timeit b = np.zeros((a.shape[0],a.shape[1]+1)); b[:,:-1] = a
    1 loops, best of 3: 492 ms per loop
    
    In [5]: %timeit b = np.empty((a.shape[0],a.shape[1]+1)); b[:,:-1] = a; b[:,-1] = np.zeros((a.shape[0],))
    1 loops, best of 3: 407 ms per loop
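
    Written out as a plain script, the empty-based variant might look like the sketch below. Two things differ from the timed one-liner and are assumptions of this sketch: N is reduced to 1000 so it runs quickly, and the last column is filled by assigning the scalar 0, which broadcasts down the column without building a temporary array.

```python
import numpy as np

N = 1000  # smaller than the N = 10000 used in the timing above
a = np.ones((N, N))

# np.empty skips initialization, so every element must be written explicitly.
b = np.empty((a.shape[0], a.shape[1] + 1), dtype=a.dtype)
b[:, :-1] = a   # copy the original data
b[:, -1] = 0    # fill the new column; the scalar broadcasts to all rows

print(b.shape)  # (1000, 1001)
```

    Because np.empty returns uninitialized memory, forgetting the last assignment would leave arbitrary values in the new column.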
    
  • 2020-11-22 15:34

    One way, using hstack, is:

    b = np.hstack((a, np.zeros((a.shape[0], 1), dtype=a.dtype)))
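
    Note that the zeros passed to hstack have shape (rows, 1). A short sketch of why that matters: a 1-D vector of zeros cannot be stacked onto a 2-D array directly and has to be turned into a column first. The variable names col and raised are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [2, 3, 4]])

col = np.zeros(a.shape[0], dtype=a.dtype)  # 1-D, shape (2,)

# Stacking a 1-D array onto a 2-D array fails: the dimensions do not match.
try:
    np.hstack((a, col))
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# Adding an axis turns the vector into a (2, 1) column that hstack accepts.
b = np.hstack((a, col[:, np.newaxis]))
print(b)       # [[1 2 3 0], [2 3 4 0]]
```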
    