I have an array:
x = np.array([[1, 2, 3], [4, 5, 6]])
and I want to create another array of shape=(1, 1)
and dtype=np.ob
Found a solution myself:
a = np.zeros(shape=(2, 2), dtype=object)  # np.object was removed in NumPy 1.24; the builtin object is equivalent
a[:] = [[x, x], [x, x]]
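For context on why the allocate-then-assign route is needed: passing the nested list straight to np.array, even with dtype=object, descends into the inner arrays and gives a 4-d array of scalars rather than a (2, 2) array of arrays. A minimal sketch, reusing x from the question (the names direct and a are just for illustration):

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

# Direct construction descends into the inner arrays.
direct = np.array([[x, x], [x, x]], dtype=object)
print(direct.shape)  # (2, 2, 2, 3)

# Allocating first and assigning cell by cell keeps each x as one element.
a = np.empty((2, 2), dtype=object)
for i in range(2):
    for j in range(2):
        a[i, j] = x
print(a.shape, a[0, 0].shape)  # (2, 2) (2, 3)
```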
@PaulPanzer's use of np.frompyfunc is clever, but all that reshaping and use of __getitem__ makes it hard to understand:
Separating the function creation from application might help:
func = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)
newarr = func(range(np.prod(osh))).reshape(osh)
This highlights the separation between the ish dimensions and the osh ones.
I also suspect a lambda function could substitute for the __getitem__.
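To check the lambda idea, here is a small sketch under the same setup as Paul's first example. The name flat is my own; the rest follows the two-line version above.

```python
import numpy as np

osh, ish = (2, 3), (2, 5)
tsh = (*osh, *ish)
data = np.arange(np.prod(tsh)).reshape(tsh)

# Collapse the outer dimensions into one axis, keep the inner ones.
flat = np.reshape(data, (-1, *ish))

# The lambda plays the role of flat.__getitem__: one scalar index in, one block out.
func = np.frompyfunc(lambda i: flat[i], 1, 1)
newarr = func(range(np.prod(osh))).reshape(osh)

print(newarr.shape, newarr.dtype)  # (2, 3) object
print(newarr[1, 2].shape)          # (2, 5)
```

The lambda is a bit more readable than the bound method, at the cost of one extra Python-level call per block.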
This works because frompyfunc returns an object dtype array. np.vectorize also uses frompyfunc but lets us specify a different output type via otypes. But both pass a scalar to the function, which is why Paul's approach uses a flattened range and __getitem__. np.vectorize with a signature lets us pass an array to the function, but it uses an ndindex iteration instead of frompyfunc.
Inspired by that, here's a np.empty plus fill method, but with ndindex as the iterator:
In [385]: osh, ish = (2, 3), (2, 5)
...: tsh = (*osh, *ish)
...: data = np.arange(np.prod(tsh)).reshape(tsh)
...: ish = np.shape(data)[len(osh):]
...:
In [386]: tsh
Out[386]: (2, 3, 2, 5)
In [387]: ish
Out[387]: (2, 5)
In [388]: osh
Out[388]: (2, 3)
In [389]: res = np.empty(osh, object)
In [390]: for idx in np.ndindex(osh):
...:     res[idx] = data[idx]
...:
In [391]: res
Out[391]:
array([[array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]]),
....
[55, 56, 57, 58, 59]])]], dtype=object)
For the second example:
In [399]: arr = np.array(data)
In [400]: arr.shape
Out[400]: (2, 2, 2, 3)
In [401]: res = np.empty(osh, object)
In [402]: for idx in np.ndindex(osh):
...:     res[idx] = arr[idx]
In the third case, np.array(data) already creates the desired (2,2) object dtype array. The res create-and-fill approach still works there, even though it produces the same thing.
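The create-and-fill pattern from these sessions can be wrapped in a small function. This is only a sketch; block_outer is a hypothetical name, and it assumes np.asarray(data) succeeds and has at least len(osh) dimensions.

```python
import numpy as np

def block_outer(data, osh):
    """Hypothetical helper: keep the leading `osh` dimensions as an object
    array and store whatever remains of each slice in its cells."""
    arr = np.asarray(data)
    res = np.empty(osh, dtype=object)
    for idx in np.ndindex(*osh):
        # idx indexes every axis of res, so this stores arr[idx] as one element.
        res[idx] = arr[idx]
    return res

osh, ish = (2, 3), (2, 5)
data = np.arange(np.prod(osh + ish)).reshape(osh + ish)
res = block_outer(data, osh)
print(res.shape, res[1, 2].shape)  # (2, 3) (2, 5)
```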
Speed isn't very different (though this example is small):
In [415]: timeit data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
49.8 µs ± 172 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [416]: %%timeit
...: arr = np.array(data)
...: res = np.empty(osh, object)
...: for idx in np.ndindex(osh): res[idx] = arr[idx]
...:
54.7 µs ± 68.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Note that when data is a (nested) list, np.reshape(data, (-1, *ish)) is, effectively, np.array(data).reshape(-1, *ish). That list first has to be turned into an array.
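A quick way to convince yourself of that equivalence, using a small made-up nested list (the values and the names via_function and via_method are only for illustration):

```python
import numpy as np

# Plain nested list with overall shape (2, 2, 3).
data = [[[0, 1, 2], [3, 4, 5]],
        [[6, 7, 8], [9, 10, 11]]]
ish = (3,)

via_function = np.reshape(data, (-1, *ish))    # converts the list internally
via_method = np.array(data).reshape(-1, *ish)  # explicit conversion first

print(via_function.shape)                        # (4, 3)
print(np.array_equal(via_function, via_method))  # True
```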
Besides speed, it would be interesting to see whether one approach is more general than the other. Are there cases that one handles but the other can't?
a = np.empty(shape=(2, 2), dtype=object)  # np.object was removed in NumPy 1.24
a.fill(x)
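One thing to keep in mind with fill: as far as I can tell, every cell ends up referring to the same x object, so an in-place change made through one cell shows up in all of them. If independent blocks are wanted, a loop that stores copies is one option. A minimal sketch, assuming x from the question:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

a = np.empty((2, 2), dtype=object)
for idx in np.ndindex(a.shape):
    a[idx] = x.copy()  # each cell gets its own copy of x

# Changing one block leaves the others, and x itself, untouched.
a[0, 0][0, 0] = 99
print(a[1, 1][0, 0])  # 1
print(x[0, 0])        # 1
```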
Here is a pretty general method: It works with nested lists, lists of lists of arrays - regardless of whether the shapes of these arrays are different or equal. It also works when the data come clumped together in one single array, which is in fact the trickiest case. (Other methods posted so far will not work in this case.)
Let's start with the difficult case, one big array:
# create example
# pick outer shape and inner shape
>>> osh, ish = (2, 3), (2, 5)
# total shape
>>> tsh = (*osh, *ish)
# make data
>>> data = np.arange(np.prod(tsh)).reshape(tsh)
>>>
# recalculate inner shape to cater for different inner shapes
# this will return the consensus bit of all inner shapes
>>> ish = np.shape(data)[len(osh):]
>>>
# block them
>>> data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
>>>
# admire
>>> data_blocked
array([[array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]]),
array([[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]]),
array([[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]])],
[array([[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]]),
array([[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49]]),
array([[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]])]], dtype=object)
Using OP's example, which is a list of lists of arrays:
>>> x = np.array([[1, 2, 3], [4, 5, 6]])
>>> y = np.array([[7, 8, 9], [0, 1, 2]])
>>> u = np.array([[3, 4, 5], [6, 7, 8]])
>>> v = np.array([[9, 0, 1], [2, 3, 4]])
>>> data = [[x, y], [u, v]]
>>>
>>> osh = (2,2)
>>> ish = np.shape(data)[len(osh):]
>>>
>>> data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
>>> data_blocked
array([[array([[1, 2, 3],
[4, 5, 6]]),
array([[7, 8, 9],
[0, 1, 2]])],
[array([[3, 4, 5],
[6, 7, 8]]),
array([[9, 0, 1],
[2, 3, 4]])]], dtype=object)
And an example with different shape subarrays (note the v.T):
>>> data = [[x, y], [u, v.T]]
>>>
>>> osh = (2,2)
>>> ish = np.shape(data)[len(osh):]
>>> data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
>>> data_blocked
array([[array([[1, 2, 3],
[4, 5, 6]]),
array([[7, 8, 9],
[0, 1, 2]])],
[array([[3, 4, 5],
[6, 7, 8]]),
array([[9, 2],
[0, 3],
[1, 4]])]], dtype=object)
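A caveat for the ragged case: newer NumPy releases (1.24 and later, to my knowledge) raise a ValueError when asked to build an array from nested sequences with inconsistent shapes unless dtype=object is given, so the np.shape(data) and np.reshape(data, ...) calls in this last example may fail there. A sketch that sidesteps the shape inference by filling a preallocated object array cell by cell (it assumes data is a list of lists exactly two levels deep, matching osh):

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[7, 8, 9], [0, 1, 2]])
u = np.array([[3, 4, 5], [6, 7, 8]])
v = np.array([[9, 0, 1], [2, 3, 4]])
data = [[x, y], [u, v.T]]  # v.T has shape (3, 2), the rest (2, 3)

osh = (2, 2)
res = np.empty(osh, dtype=object)
for i, row in enumerate(data):
    for j, block in enumerate(row):
        res[i, j] = block  # store each array as a single element

print(res.shape)        # (2, 2)
print(res[0, 0].shape)  # (2, 3)
print(res[1, 1].shape)  # (3, 2)
```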