Let's write it directly in code
Note: I edited mapper (the original example used x -> (x, 2 * x, 3 * x) just as an example) into a generic black-box function, which cause the trou
np.vectorize with the signature option can handle this. It doesn't improve the speed, but it makes the dimensional bookkeeping easier.
In [159]: def blackbox_fn(x):  # I can't be changed!
     ...:     assert np.array(x).shape == (), "I'm a fussy little function!"
     ...:     return np.array([x, 2*x, 3*x])
     ...:
The documentation for signature is a bit cryptic. I've worked with it before, so I made a good first guess:
In [161]: f = np.vectorize(blackbox_fn, signature='()->(n)')
In [162]: f(np.ones((2,2)))
Out[162]:
array([[[ 1., 2., 3.],
[ 1., 2., 3.]],
[[ 1., 2., 3.],
[ 1., 2., 3.]]])
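To unpack the signature string as I read it: the `()` on the left says each call to the function receives a scalar, and the `(n)` on the right says it returns a 1-D array whose length is taken from the first call. A small sketch (the loop and the list of shapes are mine, purely for illustration) checks that the output shape is the input shape plus one trailing axis:

```python
import numpy as np

def blackbox_fn(x):  # same fussy function as above
    assert np.array(x).shape == (), "I'm a fussy little function!"
    return np.array([x, 2*x, 3*x])

f = np.vectorize(blackbox_fn, signature='()->(n)')

# Illustrative shapes only: 0-d, 1-d, 2-d and 3-d inputs
for shape in [(), (3,), (2, 2), (2, 3, 4)]:
    out = f(np.ones(shape))
    # the loop dimensions are kept, the core dimension n=3 is appended
    assert out.shape == shape + (3,)
```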
With your array:
In [163]: arr2d = np.array(list(range(4)), dtype=np.uint8).reshape(2, 2)
In [164]: f(arr2d)
Out[164]:
array([[[0, 0, 0],
[1, 2, 3]],
[[2, 4, 6],
[3, 6, 9]]])
In [165]: _.dtype
Out[165]: dtype('int32')
The dtype is not preserved, because your blackbox_fn doesn't preserve it. By default, vectorize makes a test calculation with the first element and uses its dtype to determine the result's dtype. It is possible to specify the return dtype with the otypes parameter.
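As a minimal sketch of the otypes route (the name `f_u8` is just mine), the output dtype can be pinned to match the input. Keep in mind that values are cast into that dtype on assignment, so anything that doesn't fit in uint8 would not survive intact:

```python
import numpy as np

def blackbox_fn(x):  # same fussy function as above
    assert np.array(x).shape == (), "I'm a fussy little function!"
    return np.array([x, 2*x, 3*x])

arr2d = np.array(list(range(4)), dtype=np.uint8).reshape(2, 2)

# otypes fixes the result dtype instead of inferring it from the first call
f_u8 = np.vectorize(blackbox_fn, signature='()->(n)', otypes=[np.uint8])

out = f_u8(arr2d)
print(out.dtype)  # uint8
print(out.shape)  # (2, 2, 3)
```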
It can handle arrays other than 2d:
In [166]: f(np.arange(3))
Out[166]:
array([[0, 0, 0],
[1, 2, 3],
[2, 4, 6]])
In [167]: f(3)
Out[167]: array([3, 6, 9])
With a signature, vectorize uses a Python-level iteration. Without a signature it uses np.frompyfunc, with somewhat better performance. But as long as blackbox_fn has to be called for each element of the input, we can't improve the speed by much (at most 2x).
np.frompyfunc returns an object dtype array:
In [168]: fpy = np.frompyfunc(blackbox_fn, 1,1)
In [169]: fpy(1)
Out[169]: array([1, 2, 3])
In [170]: fpy(np.arange(3))
Out[170]: array([array([0, 0, 0]), array([1, 2, 3]), array([2, 4, 6])], dtype=object)
In [171]: np.stack(_)
Out[171]:
array([[0, 0, 0],
[1, 2, 3],
[2, 4, 6]])
In [172]: fpy(arr2d)
Out[172]:
array([[array([0, 0, 0]), array([1, 2, 3])],
[array([2, 4, 6]), array([3, 6, 9])]], dtype=object)
np.stack can't remove the array nesting in this 2d case:
In [173]: np.stack(_)
Out[173]:
array([[array([0, 0, 0]), array([1, 2, 3])],
[array([2, 4, 6]), array([3, 6, 9])]], dtype=object)
But we can ravel it and then stack. The result still needs a reshape to get the original leading dimensions back:
In [174]: np.stack(__.ravel())
Out[174]:
array([[0, 0, 0],
[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
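The missing reshape step could look something like this. It's only a sketch: the variable names are mine, and the fussy assert is dropped from blackbox_fn to keep it short.

```python
import numpy as np

def blackbox_fn(x):
    return np.array([x, 2*x, 3*x])

fpy = np.frompyfunc(blackbox_fn, 1, 1)
arr2d = np.arange(4, dtype=np.uint8).reshape(2, 2)

nested = fpy(arr2d)              # object array, shape (2, 2), each cell a (3,) array
flat = np.stack(nested.ravel())  # numeric array, shape (4, 3)

# restore the input's dimensions, letting -1 pick up the new trailing axis
result = flat.reshape(arr2d.shape + (-1,))
print(result.shape)  # (2, 2, 3)
```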
Speed tests:
In [175]: timeit f(np.arange(1000))
14.7 ms ± 322 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [176]: timeit fpy(np.arange(1000))
4.57 ms ± 161 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [177]: timeit np.stack(fpy(np.arange(1000).ravel()))
6.71 ms ± 207 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [178]: timeit np.array([blackbox_fn(i) for i in np.arange(1000)])
6.44 ms ± 235 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Having your function return a list instead of an array might make reassembling the result easier, and maybe even faster:
def foo(x):
    return [x, 2*x, 3*x]
or play about with the frompyfunc parameters:
def foo(x):
    return x, 2*x, 3*x   # return a tuple
In [204]: np.stack(np.frompyfunc(foo, 1,3)(arr2d),2)
Out[204]:
array([[[0, 0, 0],
[1, 2, 3]],
[[2, 4, 6],
[3, 6, 9]]], dtype=object)
10x speed up - I'm surprised:
In [212]: foo1 = np.frompyfunc(foo, 1,3)
In [213]: timeit np.stack(foo1(np.arange(1000)),1)
428 µs ± 17.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
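Note that the stacked result is still object dtype, as the output above shows. If a numeric array is wanted, one more astype should do it. This is a sketch; choosing the input's dtype as the target is my assumption, and it only makes sense if the results fit in it:

```python
import numpy as np

def foo(x):
    return x, 2*x, 3*x   # return a tuple

foo1 = np.frompyfunc(foo, 1, 3)   # 1 input, 3 outputs
arr2d = np.arange(4).reshape(2, 2)

parts = foo1(arr2d)                  # tuple of 3 object arrays, each (2, 2)
stacked = np.stack(parts, axis=-1)   # object array, shape (2, 2, 3)

# convert the Python ints held in the object array back to a numeric dtype
numeric = stacked.astype(arr2d.dtype)
```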
You can use basic NumPy broadcasting for this kind of "outer product":
np.arange(3)[:, None] * np.arange(2)
# array([[0, 0],
# [0, 1],
# [0, 2]])
In your case it would be:
def mapper(x):
    # np.arange(1, 4) gives the multipliers 1, 2, 3
    return (np.arange(1, 4)[:, None, None] * x).transpose((1, 2, 0))
Note that the .transpose() is only needed if you specifically need the new axis to be at the end.
And it is almost 3x as fast as stacking 3 separate multiplications:
def mapper(x):
    # np.arange(1, 4) gives the multipliers 1, 2, 3
    return (np.arange(1, 4)[:, None, None] * x).transpose((1, 2, 0))
def mapper2(x):
    return np.stack((x, 2 * x, 3 * x), axis=-1)
a = np.arange(30000).reshape(-1, 30)
%timeit mapper(a) # 48.2 µs ± 417 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit mapper2(a) # 137 µs ± 3.57 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
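A variation that avoids the transpose, and isn't tied to 2-d input, is to add the new axis to x instead. This is only a sketch: `mapper_trailing` is a name I made up, I haven't timed it, and it assumes the mapping really is the plain multiples 1x, 2x, 3x.

```python
import numpy as np

def mapper_trailing(x):
    # (..., 1) broadcast against (3,) gives (..., 3): new axis lands at the end
    x = np.asarray(x)
    return x[..., None] * np.arange(1, 4)

def mapper2(x):
    return np.stack((x, 2 * x, 3 * x), axis=-1)

a = np.arange(30000).reshape(-1, 30)

# both approaches agree on shape and values
assert mapper_trailing(a).shape == a.shape + (3,)
assert np.array_equal(mapper_trailing(a), mapper2(a))
```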
I might be getting this wrong, but a nested list comprehension does the job:
a = np.array([[0, 1],
[2, 3]])
np.array([[[j, j*2, j*3] for j in i] for i in a ])
#[[[0 0 0]
# [1 2 3]]
#
# [[2 4 6]
# [3 6 9]]]