Select rows of pandas dataframe from list, in order of list

前端未结

关注

 3  1673

The question was originally asked here as a comment but could not get a proper answer as the question was marked as a duplicate.

For a given pandas.Da


                      
              相关标签:


      
      
        
          3条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  耶瑟儿～        
                
              
                            
                2020-12-17 00:24
              
            
            
                                                                       
One way to overcome this is to make the 'A' column an index and use loc on the newly generated pandas.DataFrame. Eventually, the subsampled dataframe's index can be reset.

Here is how:

ret = df.set_index('A').loc[list_of_values].reset_index(inplace=False)

# ret is
#      A   B
# 0    3   3
# 1    4   5
# 2    6   2 


Note that the drawback of this method is that the original indexing has been lost in the process.

More on pandas indexing: What is the point of indexing in pandas?
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  忘掉有多难        
                
              
                            
                2020-12-17 00:28
              
            
            
                                                                       
1] Generic approach for list_of_values.

In [936]: dff = df[df.A.isin(list_of_values)]

In [937]: dff.reindex(dff.A.map({x: i for i, x in enumerate(list_of_values)}).sort_values().index)
Out[937]:
   A  B
2  3  3
3  4  5
1  6  2


2] If list_of_values is sorted. You can use

In [926]: df[df.A.isin(list_of_values)].sort_values(by='A')
Out[926]:
   A  B
2  3  3
3  4  5
1  6  2

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  清酒与你        
                
              
                            
                2020-12-17 00:40
              
            
            
                                                                       
Use merge with helper DataFrame created by list and with column name of matched column:

df = pd.DataFrame({'A' : [5,6,3,4], 'B' : [1,2,3,5]})

list_of_values = [3,6,4]
df1 = pd.DataFrame({'A':list_of_values}).merge(df)
print (df1)
   A  B
0  3  3
1  6  2
2  4  5


For more general solution:

df = pd.DataFrame({'A' : [5,6,5,3,4,4,6,5], 'B':range(8)})
print (df)
   A  B
0  5  0
1  6  1
2  5  2
3  3  3
4  4  4
5  4  5
6  6  6
7  5  7

list_of_values = [6,4,3,7,7,4]




#create df from list 
list_df = pd.DataFrame({'A':list_of_values})
print (list_df)
   A
0  6
1  4
2  3
3  7
4  7
5  4

#column for original index values
df1 = df.reset_index()
#helper column for count duplicates values
df1['g'] = df1.groupby('A').cumcount()
list_df['g'] = list_df.groupby('A').cumcount()

#merge together, create index from column and remove g column
df = list_df.merge(df1).set_index('index').rename_axis(None).drop('g', axis=1)
print (df)
   A  B
1  6  1
4  4  4
3  3  3
5  4  5

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复