How to match words in a DataFrame column against a list of values, ignoring case, in pandas (Python)

太阳男子 2020-12-22 12:02

I have a df,

Name
Ram is one of the key ram
Kumar is playing cricket
Ravi is playing and ravi is a good player

and a list

m         


        
2 Answers
  • 2020-12-22 12:39
    In [187]: pat = '({})'.format('|'.join(my_list))
    
    In [188]: df['Match'] = df['Name'].str.extract(pat, expand=False)
    
    In [190]: df['Count'] = df.Name.str.count(pat)
    
    In [191]: df
    Out[191]:
                                                    Name Match  Count
    0                          Ram is one of the key ram   Ram      1
    1                           Kumar is playing cricket   NaN      0
    2  Ravi is playing and ravi (ravi ravi) is a good...  ravi      3  # I've intentionally added `(ravi ravi)`
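
The output above is case-sensitive: only `Ram` is counted in row 0, and the lowercase `ram` is missed. Since the question asks for ignore-case matching, here is a minimal sketch of the same approach with `flags=re.IGNORECASE` passed to `str.extract` and `str.count`. The contents of `my_list` are an assumption (the question's list is cut off), and `re.escape` is an added safeguard that is not in the original answer.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Name": [
        "Ram is one of the key ram",
        "Kumar is playing cricket",
        "Ravi is playing and ravi is a good player",
    ]
})

# Assumed search terms; the list in the question is not shown in full.
my_list = ["ravi", "ram"]

# Build one alternation pattern; re.escape protects any regex metacharacters.
pat = "({})".format("|".join(re.escape(w) for w in my_list))

# flags=re.IGNORECASE makes both the extraction and the count case-insensitive.
df["Match"] = df["Name"].str.extract(pat, flags=re.IGNORECASE, expand=False)
df["Count"] = df["Name"].str.count(pat, flags=re.IGNORECASE)

print(df)
```

With these assumed inputs, `Match` holds the first hit as it is spelled in the text (`Ram`, `NaN`, `Ravi`), and `Count` includes every occurrence regardless of case.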
    
  • 2020-12-22 12:43

    Is this what you are looking for?

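    # Lower-case both the search terms and the column so that matching ignores case
    new_l = [i.lower() for i in my_list]
    extracted = df['Name'].str.lower().str.findall('(' + '|'.join(new_l) + ')').apply(set)

    # sorted() keeps the joined order stable, since sets are unordered
    df['Match'] = extracted.apply(lambda s: ','.join(sorted(s)))
    df['count'] = extracted.apply(len)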
    
                                              Name     Match  count
    0                      Ram is one of the key ram       ram      1
    1                       Kumar is playing cricket                0
    2  Ravi Ram is playing and ravi is a good player  ram,ravi      2
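
A possible variant of the second answer, as a sketch only: it keeps the original text intact and passes `flags=re.IGNORECASE` to `str.findall` instead of lower-casing the column, and it wraps the terms in `\b` word boundaries so that a term such as `ram` would not match inside a longer word. The contents of `my_list` are assumed, and the word-boundary behaviour is an addition that the original answer does not have.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Name": [
        "Ram is one of the key ram",
        "Kumar is playing cricket",
        "Ravi Ram is playing and ravi is a good player",
    ]
})

# Assumed search terms; the list in the question is not shown in full.
my_list = ["ravi", "ram"]

# \b restricts matches to whole words; (?:...) avoids creating a capture group,
# so findall returns the full matched text.
pat = r"\b(?:{})\b".format("|".join(re.escape(w) for w in my_list))

# Find every hit regardless of case, then normalise to lower case and de-duplicate.
found = df["Name"].str.findall(pat, flags=re.IGNORECASE)
unique = found.apply(lambda words: sorted({w.lower() for w in words}))

df["Match"] = unique.apply(",".join)
df["count"] = unique.apply(len)

print(df)
```

On this sample the `Match` and `count` columns come out the same as in the table above (`ram`, empty, `ram,ravi` and 1, 0, 2).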
    