How to create a numpy array of lists?

攒了一身酷 2020-12-01 12:02

I want to create a numpy array in which each element must be a list, so later I can append new elements to each.

I have looked on google and here on stack overflow a

6 answers
  • 2020-12-01 12:15
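    import numpy as np
    # np.object was removed in NumPy 1.24; the builtin object works in all versions
    data = np.empty(20, dtype=object)
    for i in range(data.shape[0]):
        data[i] = []       # a fresh list per slot
        data[i].append(i)
    print(data)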
    

    The result will be:

    [list([0]) list([1]) list([2]) list([3]) list([4]) list([5]) list([6]) list([7]) list([8]) list([9]) list([10]) list([11]) list([12]) list([13]) list([14]) list([15]) list([16]) list([17]) list([18]) list([19])]
    
  • 2020-12-01 12:19

    Lists aren't very numpy anyway, so maybe a tuple of lists is good enough for you. You can get that easily and rather efficiently with a generator expression:

    fiveLists = tuple([] for _ in range(5))
    

    You can leave out the tuple if you only need it once (that gives you the raw generator).

    You can use this to create a numpy array if you really want to:

    arrayOfLists = np.fromiter(([] for _ in range(5)), object)
    

    Edit: as of July 2020, this raised "ValueError: cannot create object arrays from iterator". NumPy 1.23 and later accept object dtypes in np.fromiter, so the line works again there.
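
    If the iterator route is unavailable in your NumPy version, a small fallback keeps the same idea working. The sketch below tries np.fromiter first and, on ValueError, fills a preallocated object array from a generator instead. The helper name lists_from_iter is made up for illustration and is not part of NumPy.

```python
import numpy as np

def lists_from_iter(n):
    """Return a 1-d object array holding n distinct empty lists.

    Hypothetical helper, not part of NumPy.
    """
    gen = ([] for _ in range(n))
    try:
        # Accepted by newer NumPy releases; older ones raise ValueError here.
        return np.fromiter(gen, dtype=object, count=n)
    except ValueError:
        # Fallback: allocate the slots first, then store one list per slot.
        out = np.empty(n, dtype=object)
        for i, lst in enumerate([] for _ in range(n)):
            out[i] = lst
        return out

array_of_lists = lists_from_iter(5)
array_of_lists[0].append("x")   # only the first list changes
```

    Either branch yields independent lists, because the generator builds a new list for every slot.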

  • 2020-12-01 12:24

    If you really need a 1-d array of lists, one option is to wrap your lists in your own class, since numpy will otherwise try to convert nested lists into extra array dimensions (which is more efficient but obviously requires constant-size elements). For example:

    class mylist:
    
        def __init__(self, l):
            self.l=l
    
        def __repr__(self): 
            return repr(self.l)
    
        def append(self, x):
            self.l.append(x)
    

    and then you can change any element without changing the dimension of others

    >>> x = mylist([1,2,3])
    >>> y = mylist([1,2,3])
    >>> import numpy as np
    >>> data = np.array([x,y])
    >>> data
    array([[1, 2, 3], [1, 2, 3]], dtype=object)
    >>> data[0].append(2)
    >>> data
    array([[1, 2, 3, 2], [1, 2, 3]], dtype=object)
    

    Update

    As suggested by ali_m, there is actually a way to force numpy to simply create a 1-d array of references and then fill it with actual lists:

    >>> data = np.empty(2, dtype=object)
    >>> data[:] = [1, 2, 3], [1, 2, 3]
    >>> data
    array([[1, 2, 3], [1, 2, 3]], dtype=object)
    >>> data[0].append(4)
    >>> data
    array([[1, 2, 3, 4], [1, 2, 3]], dtype=object)
    
  • 2020-12-01 12:25

    As you discovered, np.array tries to create a 2d array when given something like

     A = np.array([[1,2],[3,4]],dtype=object)
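
    To see that default concretely, here is a minimal check of what the call above produces. It only inspects the shape and the element types; nothing beyond NumPy itself is assumed.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=object)

print(A.shape)           # (2, 2): the inner lists became a second axis
print(type(A[0]))        # a row is an ndarray, so it has no list-style append
print(type(A[0, 0]))     # the leaves are plain Python ints
```

    So even with dtype=object, no list survives inside the array, and there is nothing to append to.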
    

    You have to apply some tricks to get around this default behavior.

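    One is to make the sublists variable in length. It can't make a 2d array from these, so it resorts to the object array. Since NumPy 1.24 you must request that explicitly with dtype=object; without it, ragged input raises a ValueError:

    In [43]: A=np.array([[1,2],[],[1,2,3,4]], dtype=object)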
    In [44]: A
    Out[44]: array([[1, 2], [], [1, 2, 3, 4]], dtype=object)
    

    And you can then append values to each of those lists:

    In [45]: for i in A: i.append(34)
    In [46]: A
    Out[46]: array([[1, 2, 34], [34], [1, 2, 3, 4, 34]], dtype=object)
    

    np.empty also creates an object array:

    In [47]: A=np.empty((3,),dtype=object)
    In [48]: A
    Out[48]: array([None, None, None], dtype=object)
    

    But you then have to be careful how you change the elements to lists. The array's fill method is tempting, but has problems:

    In [49]: A.fill([])
    In [50]: A
    Out[50]: array([[], [], []], dtype=object)
    In [51]: for i in A: i.append(34)
    In [52]: A
    Out[52]: array([[34, 34, 34], [34, 34, 34], [34, 34, 34]], dtype=object)
    

    It turns out that fill puts the same list in all slots, so modifying one modifies all the others. You can get the same problem with a list of lists:

    In [53]: B=[[]]*3
    In [54]: B
    Out[54]: [[], [], []]
    In [55]: for i in B: i.append(34)
    In [56]: B
    Out[56]: [[34, 34, 34], [34, 34, 34], [34, 34, 34]]
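
    The sharing can also be checked directly with an identity test rather than by appending. A small sketch, with the variable names shared and distinct chosen here for illustration:

```python
import numpy as np

shared = np.empty(3, dtype=object)
shared.fill([])                        # one list object, referenced three times
print(shared[0] is shared[1])          # True

distinct = np.empty(3, dtype=object)
for i in range(distinct.size):
    distinct[i] = []                   # a new list on every pass
print(distinct[0] is distinct[1])      # False
print(len({id(x) for x in distinct}))  # 3
```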
    

    The proper way to initialize the empty A is with an iteration, e.g.

    In [65]: A=np.empty((3,),dtype=object)
    In [66]: for i,v in enumerate(A): A[i]=[v,i]
    In [67]: A
    Out[67]: array([[None, 0], [None, 1], [None, 2]], dtype=object)
    In [68]: for v in A: v.append(34)
    In [69]: A
    Out[69]: array([[None, 0, 34], [None, 1, 34], [None, 2, 34]], dtype=object)
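
    The same element-by-element initialization extends to any shape. The sketch below wraps it in a helper that walks every index with np.ndindex; the name empty_lists is hypothetical and only used here.

```python
import numpy as np

def empty_lists(shape):
    """Return an object array of the given shape with a distinct empty list in each cell.

    Hypothetical helper, not part of NumPy.
    """
    out = np.empty(shape, dtype=object)
    for idx in np.ndindex(out.shape):   # visits every index tuple of the array
        out[idx] = []                   # new list per cell, so nothing is shared
    return out

grid = empty_lists((2, 3))
grid[0, 1].append(34)
print(grid[0, 1], grid[0, 0])           # [34] []
```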
    

    It's a little unclear from the question and comments whether you want to append to the lists, or append lists to the array. I've just demonstrated appending to the lists.

    There is an np.append function, which new users often misuse. It isn't a substitute for list append. It is a front end to np.concatenate. It is not an in-place operation; it returns a new array.

    Also defining a list to add with it can be tricky:

    In [72]: np.append(A,[[1,23]])
    Out[72]: array([[None, 0, 34], [None, 1, 34], [None, 2, 34], 1, 23], dtype=object)
    

    You need to construct another object array to concatenate to the original, e.g.

    In [76]: np.append(A,np.empty((1,),dtype=object))
    Out[76]: array([[None, 0, 34], [None, 1, 34], [None, 2, 34], None], dtype=object)
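
    Putting those two points together, one way to add a whole list as a single new element is to place it in a one-slot object array first and concatenate. This is a sketch with made-up sample values; note that the result is a new array and the original is left unchanged.

```python
import numpy as np

A = np.empty(2, dtype=object)
A[0] = [1]
A[1] = [2, 3]

new_item = np.empty(1, dtype=object)
new_item[0] = [1, 23]                 # the list is stored whole, not unpacked

B = np.concatenate([A, new_item])     # returns a new array; A keeps its length
print(B.shape, A.shape)               # (3,) (2,)
print(B[2])                           # [1, 23]
print(B[0] is A[0])                   # True: only references were copied
```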
    

    In all of this, an array of lists is harder to construct than a list of lists, and no easier, or faster, to manipulate. You have to make it a 2d array of lists to derive some benefit.

    In [78]: A[:,None]
    Out[78]: 
    array([[[None, 0, 34]],
           [[None, 1, 34]],
           [[None, 2, 34]]], dtype=object)
    

    You can reshape, transpose, etc. an object array, whereas creating and manipulating a list of lists of lists gets more complicated.

    In [79]: A[:,None].tolist()
    Out[79]: [[[None, 0, 34]], [[None, 1, 34]], [[None, 2, 34]]]
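
    A short illustration of that benefit, assuming a 2 by 3 array built cell by cell: transposing or reshaping only rearranges references, so the lists themselves stay shared between the views.

```python
import numpy as np

A = np.empty((2, 3), dtype=object)
for idx in np.ndindex(A.shape):
    A[idx] = [idx]                # each cell holds a list tagged with its own index

T = A.T                           # shape (3, 2), a view onto the same list objects
T[2, 0].append("added via T")

print(A[0, 2])                    # [(0, 2), 'added via T']
flat = A.reshape(-1)              # shape (6,), still the same objects
print(flat[2] is A[0, 2])         # True
```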
    

    ===

    As shown in https://stackoverflow.com/a/57364472/901925, np.frompyfunc is a good tool for creating an array of objects.

    np.frompyfunc(list, 0, 1)(np.empty((3,2), dtype=object))  
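
    As a rough usage sketch of that one-liner: because list is called once per cell, every slot gets its own list. The second np.frompyfunc call, which applies len to each cell, is an addition here to show the contents and is not from the linked answer.

```python
import numpy as np

# list() is invoked separately for every cell of the output array
grid = np.frompyfunc(list, 0, 1)(np.empty((3, 2), dtype=object))

grid[0, 1].append("a")
grid[2, 0].extend([1, 2])

# Apply len element-wise; the result is an object array, so cast it to int
lengths = np.frompyfunc(len, 1, 1)(grid).astype(int)
print(lengths.tolist())           # [[0, 1], [0, 0], [2, 0]]
```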
    
  • 2020-12-01 12:26

    Just found this, I've never answered a question before, but here is a pretty simple solution:

    If you want a vector of length n, use:

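    # n distinct lists ([[]]*n would repeat one shared list); NumPy >= 1.24 needs dtype=object
    A = np.array([[] for _ in range(n)] + [[1]], dtype=object)[:-1]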
    

    This returns:

    array([list([]), list([]), ... , list([])], dtype=object)
    

    If instead you want an n by m array, use:

    A = np.array([[] for _ in range(n*m)] + [[1]], dtype=object)[:-1]
    B = A.reshape((n,m))
    

    For higher rank arrays, you can use a similar method by creating a long vector and reshaping it. This may not be the most efficient way, but it worked for me.

  • 2020-12-01 12:35

    A simple way would be:

    A = [[1,2],[3,4]]
    B = np.array(A+[[]], dtype=object)[:-1]  # dtype=object is required for ragged input on NumPy >= 1.24
    