Group and average NumPy matrix

前端未结

关注

 4  1142

Say I have an arbitrary numpy matrix that looks like this:

arr = [[  6.0   12.0   1.0]
       [  7.0   9.0   1.0]
       [  8.0   7.0   1.0]
       [  4.0   3.0


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  抹茶落季        
                
              
                            
                2021-02-20 08:07
              
            
            
                                                                       
A compact solution is to use numpy_indexed (disclaimer: I am its author), which implements a fully vectorized solution:

import numpy_indexed as npi
npi.group_by(arr[:, 2]).mean(arr)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  粉色の甜心        
                
              
                            
                2021-02-20 08:15
              
            
            
                                                                       
arr = np.array(
[[  6.0,   12.0,   1.0],
 [  7.0,   9.0,   1.0],
 [  8.0,   7.0,   1.0],
 [  4.0,   3.0,   2.0],
 [  6.0,   1.0,   2.0],
 [  2.0,   5.0,   2.0],
 [  9.0,   4.0,   3.0],
 [  2.0,   1.0,   4.0],
 [  8.0,   4.0,   4.0],
 [  3.0,   5.0,   4.0]])
np.array([a.mean(0) for a in np.split(arr, np.argwhere(np.diff(arr[:, 2])) + 1)])

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  遇见更好的自我        
                
              
                            
                2021-02-20 08:16
              
            
            
                                                                       
You can do:

for x in sorted(np.unique(arr[...,2])):
    results.append([np.average(arr[np.where(arr[...,2]==x)][...,0]), 
                    np.average(arr[np.where(arr[...,2]==x)][...,1]),
                    x])


Testing:

>>> arr
array([[  6.,  12.,   1.],
       [  7.,   9.,   1.],
       [  8.,   7.,   1.],
       [  4.,   3.,   2.],
       [  6.,   1.,   2.],
       [  2.,   5.,   2.],
       [  9.,   4.,   3.],
       [  2.,   1.,   4.],
       [  8.,   4.,   4.],
       [  3.,   5.,   4.]])
>>> results=[]
>>> for x in sorted(np.unique(arr[...,2])):
...     results.append([np.average(arr[np.where(arr[...,2]==x)][...,0]), 
...                     np.average(arr[np.where(arr[...,2]==x)][...,1]),
...                     x])
... 
>>> results
[[7.0, 9.3333333333333339, 1.0], [4.0, 3.0, 2.0], [9.0, 4.0, 3.0], [4.333333333333333, 3.3333333333333335, 4.0]]


The array arr does not need to be sorted, and all the intermediate arrays are views (ie, not new arrays of data). The average is calculated efficiently directly from those views.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  离开以前        
                
              
                            
                2021-02-20 08:23
              
            
            
                                                                       
solution

from itertools import groupby
from operator import itemgetter

arr = [[6.0, 12.0, 1.0],
       [7.0, 9.0, 1.0],
       [8.0, 7.0, 1.0],
       [4.0, 3.0, 2.0],
       [6.0, 1.0, 2.0],
       [2.0, 5.0, 2.0],
       [9.0, 4.0, 3.0],
       [2.0, 1.0, 4.0],
       [8.0, 4.0, 4.0],
       [3.0, 5.0, 4.0]]

result = []

for groupByID, rows in groupby(arr, key=itemgetter(2)):
    position1, position2, counter = 0, 0, 0
    for row in rows:
        position1+=row[0]
        position2+=row[1]
        counter+=1
    result.append([position1/counter, position2/counter, groupByID])

print(result)


would output:

[[7.0, 9.333333333333334, 1.0]]
[[4.0, 3.0, 2.0]]
[[9.0, 4.0, 3.0]]
[[4.333333333333333, 3.3333333333333335, 4.0]]

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复