Append Row(s) to a NumPy Record Array

忘了有多久 2021-02-06 06:33

Is there a way to append a row to a NumPy rec.array()? For example,

import numpy as np

x1 = np.array([1, 2, 3, 4])
x2 = np.array(['a', 'dd', 'xyz', '12'])
x3 = np.array([1.1, 2, 3, 4])
r         


        
3 Answers
  • 2021-02-06 06:44

    Extending @unutbu's answer, here is a more general function that appends any number of rows:

    def append_rows(arrayIN, NewRows):
        """Append rows to numpy recarray.
    
        Arguments:
          arrayIN: a numpy recarray that should be expanded
          NewRows: list of tuples whose fields match the dtype of `arrayIN`
    
        Idea: Resize the recarray in place if possible.
        (Only reasonable for small arrays.)
    
        >>> arrayIN = np.array([(1, 'a', 1.1), (2, 'dd', 2.0), (3, 'x', 3.0)],
        ...                    dtype=[('a', '<i4'), ('b', '|S3'), ('c', '<f8')])
        >>> NewRows = [(4, '12', 4.0), (5, 'cc', 43.0)]
        >>> append_rows(arrayIN, NewRows)
        >>> print(arrayIN)
        [(1, 'a', 1.1) (2, 'dd', 2.0) (3, 'x', 3.0) (4, '12', 4.0) (5, 'cc', 43.0)]
    
        Source: http://stackoverflow.com/a/1731228/2062965
        """
        # Calculate the number of old and new rows
        len_arrayIN = arrayIN.shape[0]
        len_NewRows = len(NewRows)
        # Resize the old recarray
        arrayIN.resize(len_arrayIN + len_NewRows, refcheck=False)
        # Write to the end of recarray
        arrayIN[-len_NewRows:] = NewRows
    

    Comment

    I want to stress that pre-allocating an array that is at least as large as the final result is the most reasonable solution, provided you have an idea of the final size. Pre-allocation also saves a lot of time, because the data never has to be copied to a new location.
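As a rough illustration of the pre-allocation idea, the sketch below fills a buffer of known capacity and returns only the rows actually written. The helper name `collect_rows`, the `capacity` parameter, and the field layout are assumptions made for this example, not part of the original answer.

```python
import numpy as np

# Hypothetical field layout, mirroring the three columns in the question.
DTYPE = [('a', '<i4'), ('b', '<U3'), ('c', '<f8')]


def collect_rows(rows, capacity):
    """Fill a pre-allocated record array and return only the used part."""
    # Allocate once, up front; the contents are uninitialized until written.
    buf = np.empty(capacity, dtype=DTYPE).view(np.recarray)
    n = 0
    for row in rows:
        if n >= capacity:
            raise ValueError("capacity too small for the incoming rows")
        buf[n] = row
        n += 1
    # Slicing returns a view of the filled rows; nothing is copied.
    return buf[:n]


rows = [(1, 'a', 1.1), (2, 'dd', 2.0), (3, 'xyz', 3.0), (4, '12', 4.0), (5, 'cc', 43.0)]
r = collect_rows(rows, capacity=8)
```

The returned slice still shares memory with the larger buffer, so call `.copy()` on it if the unused tail should be released.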

  • 2021-02-06 06:51
    # np.rec is the public alias of np.core.records; dtype=r.dtype keeps the field names
    np.rec.fromrecords(r.tolist() + [(5, 'cc', 43.)], dtype=r.dtype)
    

    This still splits the array apart, but by rows rather than by columns. Maybe that is better?
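A possible alternative that avoids the round trip through Python lists is `np.concatenate` with a one-row array of the same dtype. This is a different technique from the one in this answer, shown only as a sketch. The construction of `r` via `np.rec.fromarrays` is an assumption, because the question's own line defining `r` is cut off.

```python
import numpy as np

x1 = np.array([1, 2, 3, 4])
x2 = np.array(['a', 'dd', 'xyz', '12'])
x3 = np.array([1.1, 2, 3, 4])

# Assumed construction of `r`; the original line in the question is truncated.
r = np.rec.fromarrays([x1, x2, x3], names='a,b,c')

# Build the new row with the same dtype so the field names and types line up.
new_row = np.array([(5, 'cc', 43.0)], dtype=r.dtype)

# concatenate allocates a new array; `r` itself is left unchanged.
r2 = np.concatenate([r, new_row]).view(np.recarray)
```

Like the list-based version, this copies all existing rows, so it is best suited to occasional appends rather than appends inside a loop.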

  • 2021-02-06 06:53

    You can resize NumPy arrays in place. This is faster than converting to lists and then back to NumPy arrays, and it uses less memory too.

    print(r.shape)
    # (4,)
    r.resize(5)
    print(r.shape)
    # (5,)
    r[-1] = (5, 'cc', 43.0)
    print(r)
    
    # [(1, 'a', 1.1000000000000001) 
    #  (2, 'dd', 2.0) 
    #  (3, 'xyz', 3.0) 
    #  (4, '12', 4.0)
    #  (5, 'cc', 43.0)]
    

    If there is not enough memory to expand an array in place, the resizing (or appending) operation may force NumPy to allocate space for an entirely new array and copy the old data to the new location. That is naturally rather slow, so avoid resize or append where possible. Instead, pre-allocate arrays of sufficient size from the very beginning (even if somewhat larger than ultimately necessary).
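When the final size is not known in advance, one common way to keep copies rare is to over-allocate and double the capacity whenever the buffer fills up. The class below is a hypothetical sketch of that idea; `GrowableRecords` and its methods are names invented for this example and are not a NumPy API.

```python
import numpy as np


class GrowableRecords:
    """Hypothetical helper: a record buffer that doubles its capacity when full."""

    def __init__(self, dtype, capacity=4):
        self._buf = np.empty(max(1, capacity), dtype=dtype)
        self._n = 0  # number of rows actually written

    def append(self, row):
        if self._n == len(self._buf):
            # Allocate a buffer twice as large and copy the used rows once.
            bigger = np.empty(2 * len(self._buf), dtype=self._buf.dtype)
            bigger[:self._n] = self._buf
            self._buf = bigger
        self._buf[self._n] = row
        self._n += 1

    def to_recarray(self):
        # Copy only the filled part so the result owns exactly its own data.
        return self._buf[:self._n].copy().view(np.recarray)


records = GrowableRecords([('a', '<i4'), ('b', '<U3'), ('c', '<f8')], capacity=2)
for row in [(1, 'a', 1.1), (2, 'dd', 2.0), (3, 'xyz', 3.0), (4, '12', 4.0), (5, 'cc', 43.0)]:
    records.append(row)
r = records.to_recarray()
```

With doubling, appending n rows triggers only about log2(n) reallocations, at the cost of holding up to twice the needed memory until `to_recarray` is called.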
