Flattening a list of NumPy arrays?

别那么骄傲 2020-12-01 06:13

It appears that I have data in the form of a list of NumPy arrays (each element's type() is np.ndarray):

[array([[ 0.00353654]]), array([[ 0.00353654]]), arra…


        
5 Answers
  • 2020-12-01 06:31

    You could use numpy.concatenate, which, as the name suggests, basically concatenates all the elements of such an input list into a single NumPy array, like so -

    import numpy as np
    out = np.concatenate(input_list).ravel()
    

    If you wish the final output to be a list, you can extend the solution, like so -

    out = np.concatenate(input_list).ravel().tolist()
    

    Sample run -

    In [24]: input_list
    Out[24]: 
    [array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]])]
    
    In [25]: np.concatenate(input_list).ravel()
    Out[25]: 
    array([ 0.00353654,  0.00353654,  0.00353654,  0.00353654,  0.00353654,
            0.00353654,  0.00353654,  0.00353654,  0.00353654,  0.00353654,
            0.00353654,  0.00353654,  0.00353654])
    

    Convert to list -

    In [26]: np.concatenate(input_list).ravel().tolist()
    Out[26]: 
    [0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654,
     0.00353654]
    
  • 2020-12-01 06:34

    I came across this same issue and found a solution that also combines numpy arrays of variable length:

    np.column_stack(input_list).ravel()
    

    See numpy.column_stack for more info.

    Example with variable-length arrays based on your example data:

    In [135]: input_list
    Out[135]: 
    [array([[ 0.00353654,  0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654]]),
     array([[ 0.00353654,  0.00353654,  0.00353654]])]
    
    In [136]: [i.size for i in input_list]    # variable size arrays
    Out[136]: [2, 1, 1, 3]
    
    In [137]: np.column_stack(input_list).ravel()
    Out[137]: 
    array([ 0.00353654,  0.00353654,  0.00353654,  0.00353654,  0.00353654,
            0.00353654,  0.00353654])
    

    Note: Only tested on Python 2.7.12

  • 2020-12-01 06:40

    Another simple approach is to use numpy.hstack(), followed by squeeze() to remove the singleton dimension, as in:

    In [61]: np.hstack(list_of_arrs).squeeze()
    Out[61]: 
    array([0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
           0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
           0.00353654, 0.00353654, 0.00353654])
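
    For context, np.hstack alone still leaves the singleton row axis in place; squeeze() is what removes it. A small check, assuming the same list_of_arrs of thirteen (1, 1)-shaped arrays as above:

    In [62]: np.hstack(list_of_arrs).shape   # hstack keeps the (1, N) shape
    Out[62]: (1, 13)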
    
  • 2020-12-01 06:42

    This can also be done with

    np.array(list_of_arrays).flatten().tolist()
    

    resulting in

    [0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654]
    

    Update

    As @aydow points out in the comments, using numpy.ndarray.ravel can be faster if one doesn't care whether a copy or a view is returned:

    np.array(list_of_arrays).ravel()
    

    Although, according to the docs:

    When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.

    In other words

    np.array(list_of_arrays).reshape(-1)
    

    My initial suggestion was to use numpy.ndarray.flatten, which returns a copy every time and therefore affects performance.
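
    To make the copy-versus-view distinction concrete, one can check whether each flattened result shares memory with its source. A minimal sketch, assuming a stand-in list_of_arrays shaped like the one in the question:

    import numpy as np
    
    # Hypothetical stand-in for the OP's data: thirteen (1, 1)-shaped arrays
    list_of_arrays = [np.array([[0.00353654]])] * 13
    arr = np.array(list_of_arrays)                  # shape (13, 1, 1), C-contiguous
    
    print(np.shares_memory(arr, arr.ravel()))       # True  -- a view when possible
    print(np.shares_memory(arr, arr.reshape(-1)))   # True  -- a view when possible
    print(np.shares_memory(arr, arr.flatten()))     # False -- always a copy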

    Let's now see how the time complexity of the solutions listed above compares, using the perfplot package with a setup similar to the OP's:

    import numpy as np
    import perfplot
    
    perfplot.show(
        setup=lambda n: np.random.rand(n, 2),
        kernels=[lambda a: a.ravel(),
                 lambda a: a.flatten(),
                 lambda a: a.reshape(-1)],
        labels=['ravel', 'flatten', 'reshape'],
        n_range=[2**k for k in range(16)],
        xlabel='N')
    

    Here flatten demonstrates piecewise linear complexity, which can reasonably be explained by it making a copy of the initial array, compared to the constant complexity of ravel and reshape, which return a view.

    It's also worth noting that, quite predictably, converting the outputs with .tolist() evens out the performance of all three, making them equally linear.
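
    For reference, the .tolist() comparison can be reproduced with the same perfplot setup; this is just a sketch mirroring the benchmark above, where the list conversion dominates and all three kernels end up linear:

    import numpy as np
    import perfplot
    
    perfplot.show(
        setup=lambda n: np.random.rand(n, 2),
        kernels=[lambda a: a.ravel().tolist(),
                 lambda a: a.flatten().tolist(),
                 lambda a: a.reshape(-1).tolist()],
        labels=['ravel+tolist', 'flatten+tolist', 'reshape+tolist'],
        n_range=[2**k for k in range(16)],
        xlabel='N')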

  • 2020-12-01 06:44

    Another way to flatten the arrays, using itertools:

    import itertools
    import numpy as np
    
    # Recreating the list of arrays from the question
    a = [np.array([[0.00353654]])] * 13
    
    # chain.from_iterable yields the rows of each 2-D array, so ravel each array
    # first to get a flat stream of scalars, then build a list from that iterator
    flattened = list(itertools.chain.from_iterable(arr.ravel() for arr in a))
    

    This solution should be very fast; see https://stackoverflow.com/a/408281/5993892 for more explanation.

    If the resulting data structure should be a numpy array instead, use numpy.fromiter() to exhaust the iterator into an array:

    # Feed the same flat stream of scalars to numpy.fromiter to build the array directly
    flattened_array = np.fromiter(
        itertools.chain.from_iterable(arr.ravel() for arr in a), dtype=float)
    

    Docs for itertools.chain.from_iterable(): https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable

    Docs for numpy.fromiter(): https://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html
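
    As an optional tweak (not part of the original answer), numpy.fromiter also accepts a count argument; when the total number of elements is known up front, passing it lets NumPy allocate the output array once:

    # Total element count is cheap to compute from the array sizes
    total = sum(arr.size for arr in a)
    flattened_array = np.fromiter(
        itertools.chain.from_iterable(arr.ravel() for arr in a),
        dtype=float, count=total)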
