Force numpy to create array of objects

终归单人心 2020-11-30 13:08

I have an array:

x = np.array([[1, 2, 3], [4, 5, 6]])

and I want to create another array of shape=(1, 1) and dtype=np.ob

4 Answers
  • 2020-11-30 13:40

    Found a solution myself:

    # np.object was removed in NumPy 1.24; the builtin object is the equivalent dtype
    a = np.zeros(shape=(2, 2), dtype=object)
    a[:] = [[x, x], [x, x]]
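
    As a minimal sketch of why this works: allocating the object array first fixes the outer shape, so the slice assignment stores each x as one cell instead of merging everything into a 4-D numeric array. The printed checks are only illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

# Allocate the (2, 2) object array first, then assign the nested list into it.
a = np.zeros(shape=(2, 2), dtype=object)
a[:] = [[x, x], [x, x]]

print(a.shape, a.dtype)   # shape and dtype of the outer array
print(a[0, 0].shape)      # shape of the array stored in one cell
print(a[0, 0] is x)       # checks whether the cell is the very same object as x
```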
    
  • 2020-11-30 13:45

    @PaulPanzer's use of np.frompyfunc is clever, but all the reshaping and the use of __getitem__ make it hard to follow.

    Separating the function creation from application might help:

    func = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)
    newarr = func(range(np.prod(osh))).reshape(osh)
    

    This highlights the separation between the ish dimensions and the osh ones.

    I also suspect a lambda function could substitute for the __getitem__.
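
    A rough sketch of that lambda substitution, reusing the osh/ish example from the other answer. The names flat and pick are made up for illustration.

```python
import numpy as np

osh, ish = (2, 3), (2, 5)            # outer and inner shapes from the example
data = np.arange(np.prod(osh + ish)).reshape(osh + ish)

flat = np.reshape(data, (-1, *ish))  # one entry per outer cell
pick = np.frompyfunc(lambda i: flat[i], 1, 1)   # lambda in place of flat.__getitem__
newarr = pick(range(int(np.prod(osh)))).reshape(osh)

print(newarr.shape, newarr.dtype)
```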

    This works because frompyfunc returns an object dtype array. np.vectorize also uses frompyfunc but lets us specify a different output dtype through otypes. Both pass scalars to the function, which is why Paul's approach applies __getitem__ to a flattened range of indices. np.vectorize with a signature lets us pass an array to the function, but it iterates with ndindex instead of using frompyfunc.
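
    The dtype difference can be seen with a toy function (the squaring lambda is only an illustration, not part of the original problem):

```python
import numpy as np

values = np.arange(4)

# frompyfunc: the Python function is called once per scalar element,
# and the result array has dtype=object.
square_obj = np.frompyfunc(lambda v: v * v, 1, 1)
out_obj = square_obj(values)

# np.vectorize: same per-scalar calling convention, but otypes picks the output dtype.
square_float = np.vectorize(lambda v: v * v, otypes=[float])
out_float = square_float(values)

print(out_obj.dtype, out_float.dtype)
```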

    Inspired by that, here's a np.empty plus fill method - but with ndindex as the iterator:

    In [385]: osh, ish = (2, 3), (2, 5)
         ...: tsh = (*osh, *ish)
         ...: data = np.arange(np.prod(tsh)).reshape(tsh)
         ...: ish = np.shape(data)[len(osh):]
         ...: 
    In [386]: tsh
    Out[386]: (2, 3, 2, 5)
    In [387]: ish
    Out[387]: (2, 5)
    In [388]: osh
    Out[388]: (2, 3)
    In [389]: res = np.empty(osh, object)
    In [390]: for idx in np.ndindex(osh):
         ...:     res[idx] = data[idx]
         ...:     
    In [391]: res
    Out[391]: 
    array([[array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]]),
           ....
           [55, 56, 57, 58, 59]])]], dtype=object)
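
    The same loop can be wrapped in a small function. This is only a sketch: to_object_blocks is a hypothetical name, and it assumes data can be converted to a regular array whose leading axes match osh.

```python
import numpy as np

def to_object_blocks(data, osh):
    """Hypothetical helper: wrap the leading `osh` axes of `data` in an object array."""
    arr = np.asarray(data)
    res = np.empty(osh, dtype=object)
    for idx in np.ndindex(*osh):
        res[idx] = arr[idx]          # basic indexing, so each cell is a view of arr
    return res

data = np.arange(60).reshape(2, 3, 2, 5)
res = to_object_blocks(data, (2, 3))
print(res.shape, res[1, 2].shape)
print(np.shares_memory(res[0, 0], data))
```

    Because the cells are views, writing into a block also changes data; storing arr[idx].copy() instead would decouple them.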
    

    For the second example:

    In [399]: arr = np.array(data)
    In [400]: arr.shape
    Out[400]: (2, 2, 2, 3)
    In [401]: res = np.empty(osh, object)
    In [402]: for idx in np.ndindex(osh):
         ...:     res[idx] = arr[idx]
    

    In the third case, np.array(data) already creates the desired (2, 2) object dtype array (recent NumPy versions raise an error for such ragged input unless dtype=object is passed). Creating res and filling it still works, though it produces the same thing.
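
    For ragged input, one way to avoid relying on how np.array treats nested lists is to fill the object array straight from the list. A minimal sketch, with shortened data in which only v.T differs in shape:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])
v = np.array([[9, 0, 1], [2, 3, 4]])
data = [[x, x], [x, v.T]]            # v.T has shape (3, 2), the others (2, 3)

res = np.empty((2, 2), dtype=object)
for i, row in enumerate(data):
    for j, block in enumerate(row):
        res[i, j] = block            # index the nested list directly, no np.array(data)

print([[b.shape for b in row] for row in res])
```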

    The speed is not very different (though this example is small):

    In [415]: timeit data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
    49.8 µs ± 172 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    In [416]: %%timeit
         ...: arr = np.array(data)
         ...: res = np.empty(osh, object)
         ...: for idx in np.ndindex(osh): res[idx] = arr[idx]
         ...: 
    54.7 µs ± 68.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
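
    Outside IPython, a comparable measurement could be sketched with the standard timeit module. The function names are made up here, and the timings will differ by machine and NumPy version, so no numbers are quoted.

```python
import timeit
import numpy as np

osh, ish = (2, 3), (2, 5)
data = np.arange(np.prod(osh + ish)).reshape(osh + ish)

def via_frompyfunc():
    flat = np.reshape(data, (-1, *ish))
    return np.frompyfunc(flat.__getitem__, 1, 1)(range(int(np.prod(osh)))).reshape(osh)

def via_ndindex():
    res = np.empty(osh, dtype=object)
    for idx in np.ndindex(*osh):
        res[idx] = data[idx]
    return res

# Time each approach over the same number of calls and report the mean per call.
for fn in (via_frompyfunc, via_ndindex):
    seconds = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```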
    

    Note that when data is a (nested) list, np.reshape(data, (-1, *ish)) is effectively np.array(data).reshape(-1, *ish). The list first has to be turned into an array.
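
    A quick check of that equivalence on a regular nested list (the variable names are only for illustration):

```python
import numpy as np

ish = (2, 3)
nested = np.arange(24).reshape(2, 2, *ish).tolist()   # a plain nested list

via_function = np.reshape(nested, (-1, *ish))     # the function converts the list itself
via_method = np.array(nested).reshape(-1, *ish)   # explicit conversion, then the method

print(via_function.shape, np.array_equal(via_function, via_method))
```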

    Besides speed, it would be interesting to see whether one approach is more general than the other. Are there cases that one handles but the other can't?

  • 2020-11-30 13:50
    # np.object was removed in NumPy 1.24; the builtin object is the equivalent dtype
    a = np.empty(shape=(2, 2), dtype=object)
    a.fill(x)
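
    One thing to keep in mind with fill: an object array stores references, so every cell ends up pointing at the same x. A small sketch of the consequence, with a per-cell copy as one possible alternative:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

a = np.empty((2, 2), dtype=object)
a.fill(x)                       # every cell gets the same object

x[0, 0] = 99                    # an in-place change to x ...
print(a[1, 1][0, 0])            # ... is visible through every cell

# Independent copies per cell, if that is what is needed:
b = np.empty((2, 2), dtype=object)
for idx in np.ndindex(b.shape):
    b[idx] = x.copy()
```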
    
  • 2020-11-30 13:51

    Here is a pretty general method: It works with nested lists, lists of lists of arrays - regardless of whether the shapes of these arrays are different or equal. It also works when the data come clumped together in one single array, which is in fact the trickiest case. (Other methods posted so far will not work in this case.)

    Let's start with the difficult case, one big array:

    # create example
    # pick outer shape and inner shape
    >>> osh, ish = (2, 3), (2, 5)
    # total shape
    >>> tsh = (*osh, *ish)
    # make data
    >>> data = np.arange(np.prod(tsh)).reshape(tsh)
    >>>
    # recalculate inner shape to cater for different inner shapes
    # this will return the consensus bit of all inner shapes
    >>> ish = np.shape(data)[len(osh):]
    >>> 
    # block them
    >>> data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
    >>> 
    # admire
    >>> data_blocked
    array([[array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]]),
            array([[10, 11, 12, 13, 14],
           [15, 16, 17, 18, 19]]),
            array([[20, 21, 22, 23, 24],
           [25, 26, 27, 28, 29]])],
           [array([[30, 31, 32, 33, 34],
           [35, 36, 37, 38, 39]]),
            array([[40, 41, 42, 43, 44],
           [45, 46, 47, 48, 49]]),
            array([[50, 51, 52, 53, 54],
           [55, 56, 57, 58, 59]])]], dtype=object)
    

    Using OP's example which is a list of lists of arrays:

    >>> x = np.array([[1, 2, 3], [4, 5, 6]])
    >>> y = np.array([[7, 8, 9], [0, 1, 2]])
    >>> u = np.array([[3, 4, 5], [6, 7, 8]])
    >>> v = np.array([[9, 0, 1], [2, 3, 4]])
    >>> data = [[x, y], [u, v]]
    >>> 
    >>> osh = (2,2)
    >>> ish = np.shape(data)[len(osh):]
    >>> 
    >>> data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
    >>> data_blocked
    array([[array([[1, 2, 3],
           [4, 5, 6]]),
            array([[7, 8, 9],
           [0, 1, 2]])],
           [array([[3, 4, 5],
           [6, 7, 8]]),
            array([[9, 0, 1],
           [2, 3, 4]])]], dtype=object)
    

    And an example with different shape subarrays (note the v.T):

    >>> data = [[x, y], [u, v.T]]
    >>> 
    >>> osh = (2,2)
    >>> ish = np.shape(data)[len(osh):]
    >>> data_blocked = np.frompyfunc(np.reshape(data, (-1, *ish)).__getitem__, 1, 1)(range(np.prod(osh))).reshape(osh)
    >>> data_blocked
    array([[array([[1, 2, 3],
           [4, 5, 6]]),
            array([[7, 8, 9],
           [0, 1, 2]])],
           [array([[3, 4, 5],
           [6, 7, 8]]),
            array([[9, 2],
           [0, 3],
           [1, 4]])]], dtype=object)
    