How to add multiple extra columns to a NumPy array

倖福魔咒の submitted on 2021-02-08 05:16:33

Question


Let’s say I have two NumPy arrays, a and b:

a = np.array([
    [1, 2, 3],
    [2, 3, 4]
    ])

b = np.array([8,9])

I would like to append the same array b to every row (i.e. add multiple columns) to get an array c:

c = np.array([
    [1, 2, 3, 8, 9],
    [2, 3, 4, 8, 9]
    ])

How can I do this easily and efficiently in NumPy?

I am especially concerned about its behaviour with big datasets (where a is much bigger than b). Is there any way around creating many copies of b (i.e. a.shape[0] of them)?

Related to this question, but with multiple values.


Answer 1:


An alternative to the concatenate approach is to allocate a result array of the final shape and copy the values into it:

In [483]: a = np.arange(300).reshape(100,3)
In [484]: b=np.array([8,9])
In [485]: res = np.zeros((100,5),int)
In [486]: res[:,:3]=a
In [487]: res[:,3:]=b

Sample timings:

In [488]: %%timeit
     ...: res = np.zeros((100,5),int)
     ...: res[:,:3]=a
     ...: res[:,3:]=b
6.11 µs ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [491]: timeit np.concatenate((a, b.repeat(100).reshape(2,-1).T),1)
7.74 µs ± 15.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [164]: timeit np.concatenate([a, np.ones([a.shape[0],1], dtype=int).dot(np.array([b]))], axis=1) 
8.58 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
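
The preallocation approach above can be wrapped in a small helper. This is only a minimal sketch: the name append_columns is hypothetical (it is not a NumPy function), and it assumes a is 2-D and b is 1-D. It uses np.empty instead of np.zeros on the assumption that zero-filling is unnecessary when every element is overwritten anyway.

```python
import numpy as np

def append_columns(a, b):
    """Return a new array with the 1-D array b appended to every row of a.

    Hypothetical helper for illustration; not part of NumPy.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    n_rows, n_cols = a.shape
    # np.empty skips zero-filling; every element is overwritten below.
    res = np.empty((n_rows, n_cols + b.shape[0]), dtype=np.result_type(a, b))
    res[:, :n_cols] = a   # copy the original columns
    res[:, n_cols:] = b   # broadcasting writes b into every row
    return res

a = np.array([[1, 2, 3], [2, 3, 4]])
b = np.array([8, 9])
c = append_columns(a, b)
print(c)
```

The only full-size allocation here is the result itself; b is never expanded into a separate (a.shape[0], 2) array before being written.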



Answer 2:


Here's one way. I assume it's efficient because it's vectorised. It relies on the fact that, in matrix multiplication, pre-multiplying a row vector by a column of ones (here [[1], [1]]) produces two stacked copies of that row.

import numpy as np

a = np.array([
    [1, 2, 3],
    [2, 3, 4]
    ])

b = np.array([[8,9]])

np.concatenate([a, np.array([[1],[1]]).dot(b)], axis=1)

Out: array([[1, 2, 3, 8, 9],
            [2, 3, 4, 8, 9]])

Note that b is specified slightly differently (as a two-dimensional array).

"Is there any way around creating many copies of b?"

The final result contains those copies (and NumPy arrays are literally arrays of values in memory), so I don't see how.
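
As a hedged illustration of that point: the intermediate repeated block can be a zero-copy view, but the result cannot. The sketch below uses np.broadcast_to (a technique swapped in here, not one used in the answer) to build a read-only view in which every row aliases b, then concatenates it. The variable name b_rows is made up for this example.

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 4]])
b = np.array([8, 9])

# Read-only view that repeats b along a new first axis without copying it.
b_rows = np.broadcast_to(b, (a.shape[0], b.shape[0]))
print(b_rows.strides[0])            # 0: stepping to the next row moves zero bytes
print(np.shares_memory(b_rows, b))  # the view reuses b's buffer

# The concatenated result is a fresh contiguous array,
# so it necessarily stores its own a.shape[0] copies of b.
c = np.concatenate([a, b_rows], axis=1)
print(c)
```

So the temporary can be avoided, while the copies inside c itself remain, as the answer says.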




Answer 3:


The way I solved this initially was:

c = np.concatenate([a, np.tile(b, (a.shape[0], 1))], axis=1)

But this feels very inefficient...
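
Whether the tile version is actually slower is easiest to check by measuring. Below is a rough sketch of such a comparison using the standard timeit module; the array sizes and the function names with_tile and with_preallocation are assumptions chosen for illustration, and the numbers it prints will vary by machine and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical sizes, chosen only so that a is much bigger than b.
a = np.arange(300_000).reshape(100_000, 3)
b = np.array([8, 9])

def with_tile():
    # The approach from this answer: materialise b for every row, then join.
    return np.concatenate([a, np.tile(b, (a.shape[0], 1))], axis=1)

def with_preallocation():
    # Allocate the result once and fill both parts in place.
    res = np.empty((a.shape[0], a.shape[1] + b.shape[0]), dtype=a.dtype)
    res[:, :a.shape[1]] = a
    res[:, a.shape[1]:] = b
    return res

# Both approaches should produce identical arrays.
same = np.array_equal(with_tile(), with_preallocation())
print(same)

runs = 20
timings = {f.__name__: timeit.timeit(f, number=runs) for f in (with_tile, with_preallocation)}
for name, seconds in timings.items():
    print(f"{name}: {seconds / runs * 1e3:.3f} ms per call")
```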



Source: https://stackoverflow.com/questions/52132331/how-to-add-multiple-extra-columns-to-a-numpy-array
