Pandas: How to compare columns of lists row-wise in a DataFrame (without a for loop)?

DataFrame

df = pd.DataFrame({'A': [['gener'], ['gener'], ['system'], ['system'], ['gutter'], ['gutter'], ['gutter'], ['gutter'


                      
2 Answers

北恋, 2021-02-05 21:51

To check if every item in df.A is contained in df.B:

>>> df.apply(lambda row: all(i in row.B for i in row.A), axis=1)
# OR: ~(df['A'].apply(set) - df['B'].apply(set)).astype(bool)
0     False
1     False
2      True
3      True
4      True
5      True
6      True
7      True
8      True
9      True
10     True
11     True
12     True
13     True
14     True
15     True
16     True
17     True
18     True
19     True
dtype: bool
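
For anyone who wants to try the check in isolation, here is a minimal sketch using `set.issubset` in a list comprehension instead of `apply`. The three-row frame is made up for illustration (it is not the asker's data), and the name `is_subset` is my own.

```python
import pandas as pd

# Hypothetical three-row frame, not the asker's data.
df = pd.DataFrame({
    "A": [["gener"], ["system"], ["aluminum", "toledo"]],
    "B": [["gutter"], ["gutter", "system"],
          ["aluminum", "gutter", "toledo", "ohio"]],
})

# True where every item of A also appears in B on the same row.
# set.issubset accepts any iterable, so B can stay a plain list.
is_subset = pd.Series(
    [set(a).issubset(b) for a, b in zip(df["A"], df["B"])],
    index=df.index,
)
print(is_subset.tolist())
```

This should give the same answer as the `apply` version on the same rows; the comprehension just avoids building a Series object for every row.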


To get the intersection:

df['intersection'] = [list(set(a).intersection(set(b))) for a, b in zip(df.A, df.B)]

>>> df
                     A                                      B        intersection
0              [gener]                               [gutter]                  []
1              [gener]                               [gutter]                  []
2             [system]                       [gutter, system]            [system]
3             [system]                [gutter, guard, system]            [system]
4             [gutter]                         [ohio, gutter]            [gutter]
5             [gutter]                       [gutter, toledo]            [gutter]
6             [gutter]                       [toledo, gutter]            [gutter]
7             [gutter]                               [gutter]            [gutter]
8             [gutter]                               [gutter]            [gutter]
9             [gutter]                               [gutter]            [gutter]
10          [aluminum]    [how, to, instal, aluminum, gutter]          [aluminum]
11          [aluminum]                     [aluminum, gutter]          [aluminum]
12          [aluminum]              [aluminum, gutter, color]          [aluminum]
13          [aluminum]                     [aluminum, gutter]          [aluminum]
14          [aluminum]       [aluminum, gutter, adrian, ohio]          [aluminum]
15          [aluminum]  [aluminum, gutter, bowl, green, ohio]          [aluminum]
16          [aluminum]        [aluminum, gutter, maume, ohio]          [aluminum]
17          [aluminum]   [aluminum, gutter, perrysburg, ohio]          [aluminum]
18          [aluminum]     [aluminum, gutter, tecumseh, ohio]          [aluminum]
19  [aluminum, toledo]       [aluminum, gutter, toledo, ohio]  [aluminum, toledo]
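
One caveat with the set-based version: sets are unordered, so the items in each result list can come back in any order. If the order of `A` matters, a variant along these lines may help. It is only a sketch; `ordered_intersection` is a hypothetical helper name and the two-row frame is made up.

```python
import pandas as pd

# Hypothetical two-row frame, not the asker's data.
df = pd.DataFrame({
    "A": [["gener"], ["toledo", "aluminum"]],
    "B": [["gutter"], ["aluminum", "gutter", "toledo", "ohio"]],
})

def ordered_intersection(a, b):
    """Items of `a` that also occur in `b`, in the order they appear in `a`."""
    b_set = set(b)  # set lookup keeps the membership test cheap
    return [item for item in a if item in b_set]

df["intersection"] = [
    ordered_intersection(a, b) for a, b in zip(df["A"], df["B"])
]
print(df["intersection"].tolist())
```

Unlike the set version, this keeps duplicates that appear in `A`, so deduplicate first if that is not wanted.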

                                                                        
                                                        
            
            
              
                
悲哀的现实, 2021-02-05 22:01

Just use the apply function that pandas provides; it works well here.

Since you may need to intersect more than two columns, you can write a helper function and apply it with DataFrame.apply (see http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html). Note that axis=1 passes each row to the function, while axis=0 passes each column. With axis=1, each row arrives as an iterable Series holding that row's value from every column.

def intersect(ss):
    ss = iter(ss)
    s = set(next(ss))
    for t in ss:
        s.intersection_update(t)  # `t` does not have to be a set; a list or any other iterable is fine
    return s

res = df.apply(intersect, axis=1)

>>> res
0                     {}
1                     {}
2               {system}
3               {system}
4               {gutter}
5               {gutter}
6               {gutter}
7               {gutter}
8               {gutter}
9               {gutter}
10            {aluminum}
11            {aluminum}
12            {aluminum}
13            {aluminum}
14            {aluminum}
15            {aluminum}
16            {aluminum}
17            {aluminum}
18            {aluminum}
19    {aluminum, toledo}


You can apply further operations to the result of the helper function, or write similar variations of it.
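
As a rough illustration of both points, the sketch below runs the same helper over three list columns and then converts each resulting set to a sorted list. The column `C`, the frame `df3`, and the name `res_sorted` are made up for the example.

```python
import pandas as pd

def intersect(row):
    """Intersect every list in a row; same logic as the helper above."""
    values = iter(row)
    common = set(next(values))
    for other in values:
        common.intersection_update(other)  # accepts any iterable
    return common

# Hypothetical frame with a third list column `C`.
df3 = pd.DataFrame({
    "A": [["gutter"], ["aluminum", "toledo"]],
    "B": [["ohio", "gutter"], ["aluminum", "gutter", "toledo", "ohio"]],
    "C": [["gutter", "guard"], ["toledo", "ohio"]],
})

res = df3.apply(intersect, axis=1)  # Series of sets, one per row
res_sorted = res.apply(sorted)      # follow-up step: sets -> sorted lists
print(res_sorted.tolist())
```

Sorting gives a deterministic order, which is handy if the result is going to be compared or written to a file.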

Hope this helps.
                                                                        
                                                        
            
            
              
                