Question
Let's say I have two NumPy arrays, a and b:
a = np.array([
[1, 2, 3],
[2, 3, 4]
])
b = np.array([8,9])
And I would like to append the same array b to every row (i.e. adding multiple columns) to get an array c:
c = np.array([
[1, 2, 3, 8, 9],
[2, 3, 4, 8, 9]
])
How can I do this easily and efficiently in NumPy?
I am especially concerned about its behaviour with big datasets (where a is much bigger than b). Is there any way around creating many copies of b (i.e. a.shape[0] of them)?
Related to this question, but with multiple values.
Answer 1:
An alternative to the concatenate approach is to make a recipient array and copy values into it:
In [483]: a = np.arange(300).reshape(100,3)
In [484]: b=np.array([8,9])
In [485]: res = np.zeros((100,5),int)
In [486]: res[:,:3]=a
In [487]: res[:,3:]=b
Sample timings:
In [488]: %%timeit
...: res = np.zeros((100,5),int)
...: res[:,:3]=a
...: res[:,3:]=b
...:
...:
6.11 µs ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [491]: timeit np.concatenate((a, b.repeat(100).reshape(2,-1).T),1)
7.74 µs ± 15.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [164]: timeit np.concatenate([a, np.ones([a.shape[0],1], dtype=int).dot(np.array([b]))], axis=1)
8.58 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
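The preallocate-and-assign approach timed above could be wrapped in a small helper. This is only a sketch: the name append_columns is hypothetical, and it assumes a is 2-D and b is 1-D. It uses np.empty instead of np.zeros because every element is overwritten anyway.

```python
import numpy as np

def append_columns(a, b):
    """Return a new array whose rows are the rows of `a` followed by `b`.

    Hypothetical helper; assumes `a` is 2-D and `b` is 1-D.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    n_rows, n_cols = a.shape
    # Pick a dtype able to hold values from both inputs.
    out = np.empty((n_rows, n_cols + b.shape[0]), dtype=np.result_type(a, b))
    out[:, :n_cols] = a   # copy the original columns
    out[:, n_cols:] = b   # b is broadcast across all rows on assignment
    return out

a = np.array([[1, 2, 3],
              [2, 3, 4]])
b = np.array([8, 9])
c = append_columns(a, b)
```

The assignment of b relies on broadcasting, so no intermediate (n_rows, 2) array of repeated b values is built before the copy into the result.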
Answer 2:
Here's one way. I assume it's efficient because it's vectorised. It relies on the fact that in matrix multiplication, pre-multiplying a row by the column (1, 1) will produce two stacked copies of the row.
import numpy as np
a = np.array([
[1, 2, 3],
[2, 3, 4]
])
b = np.array([[8,9]])
np.concatenate([a, np.array([[1],[1]]).dot(b)], axis=1)
Out: array([[1, 2, 3, 8, 9],
[2, 3, 4, 8, 9]])
Note that b is specified slightly differently (as a two-dimensional array).
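The same matrix-product trick can be written for any number of rows by building the column of ones from a.shape[0]. The following is a sketch of that generalisation; the variable names ones_col and stacked are just illustrative.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)
b = np.array([[8, 9]])                      # 2-D, shape (1, 2)

# Column of ones with one entry per row of a, shape (4, 1).
ones_col = np.ones((a.shape[0], 1), dtype=b.dtype)

# (4, 1) @ (1, 2) -> (4, 2): each row of the product is a copy of b.
stacked = ones_col @ b

c = np.concatenate([a, stacked], axis=1)    # shape (4, 5)
```

The product gives the same values as np.tile(b, (a.shape[0], 1)), so it still materialises one copy of b per row before concatenating.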
"Is there any way around creating many copies of b?"
The final result contains those copies (and numpy arrays are literally arrays of values in memory), so I don't see how.
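One thing that can be avoided is the intermediate array of repeated values. As a sketch, np.broadcast_to returns a read-only view of b with a zero stride along the row axis, so the only copies made are the ones written into the final result:

```python
import numpy as np

a = np.arange(300).reshape(100, 3)
b = np.array([8, 9])

# Read-only view of shape (100, 2); no data is copied here.
b_view = np.broadcast_to(b, (a.shape[0], b.shape[0]))

# A zero stride along axis 0 means every "row" points at the same memory as b.
row_stride = b_view.strides[0]
same_memory = np.shares_memory(b_view, b)

# concatenate writes the repeated values once, directly into the new array.
c = np.concatenate([a, b_view], axis=1)
```

The result c still holds a.shape[0] copies of b, as the answer says; only the temporary produced by repeat, tile, or dot is skipped.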
Answer 3:
The way I solved this initially was :
c = np.concatenate([a, np.tile(b, (a.shape[0],1))], axis = 1)
But this feels very inefficient...
Source: https://stackoverflow.com/questions/52132331/how-to-add-multiple-extra-columns-to-a-numpy-array