Fast pandas filtering

后悔当初 2021-02-06 13:12

I want to filter a pandas DataFrame, keeping only the rows whose name column entry appears in a given list.

Here is the DataFrame:

x = DataFrame(
    [['sam', 328], [

3 answers
  •  挽巷 (original poster)
    2021-02-06 13:50

    When I need to search on a field, I have noticed that it helps immensely to make that field the index of the DataFrame. For one of my search-and-lookup requirements, this gave a performance improvement of around 500%.

    So in your case, the following could be used to search and filter by name:

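    import pandas as pd

    df = pd.DataFrame([['sam', 328], ['ruby', 3213], ['jon', 121]],
                      columns=['name', 'score'])
    names = ['sam', 'ruby']

    # Make the search field the index (done once, reused for every lookup)
    df_searchable = df.set_index('name')

    # Keep only the rows whose index label is in names
    df_searchable[df_searchable.index.isin(names)]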
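
How much this helps probably depends on the size of the data, how often the index is reused, and the pandas version, so it is worth timing on your own data. Below is a rough sketch of such a comparison. The row count, the `user…` names, and the three function names are made up for illustration and are not from the question. It also tries `.loc`, which looks the labels up through the index, as a third option. No timings are shown here because they will vary from machine to machine.

```python
import timeit

import pandas as pd

# Hypothetical benchmark data (not from the question): 100,000 unique names.
n_rows = 100_000
df = pd.DataFrame({
    "name": [f"user{i}" for i in range(n_rows)],
    "score": list(range(n_rows)),
})
names = ["user10", "user500", "user99999"]

# Build the name index once, outside the timed functions.
df_searchable = df.set_index("name")


def filter_by_column():
    # Boolean mask computed on the plain column.
    return df[df["name"].isin(names)]


def filter_by_index_mask():
    # The approach from the answer: boolean mask computed on the index.
    return df_searchable[df_searchable.index.isin(names)]


def filter_by_index_loc():
    # Label lookup through the index; raises KeyError if a name is missing.
    return df_searchable.loc[names]


for func in (filter_by_column, filter_by_index_mask, filter_by_index_loc):
    seconds = timeit.timeit(func, number=10)
    print(f"{func.__name__}: {seconds:.4f} s for 10 runs")
```

Note that `.loc[names]` returns the rows in the order given in `names` and fails on names that are not in the index, while both `isin` variants keep the original row order and silently skip missing names.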
    
