Exact match of string in pandas python

醉话见心 2021-01-20 00:09

I have a column in a DataFrame, for example df:

  A
0 Good to 1. Good communication EI : tathagata.kar@ae.com
1 SAP ECC Project System  EI: ram.vaddadi@ae.com
2          


        
3 answers
  • 2021-01-20 00:25

    Why not just use:

    df1 = df[df['A'].str.match(ls[i])]
    

    It is the equivalent of re.match: the pattern is matched from the start of each string.
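
    A minimal sketch of what that implies, assuming a column shaped like the one in the question (the sample rows and the `email` variable are illustrative). `str.match` only succeeds when the pattern matches at the beginning of each string, whereas `str.contains` searches anywhere in it, so an address at the end of a line is only found by the latter:

```python
import re
import pandas as pd

# Illustrative data modelled on the question's column A
df = pd.DataFrame({"A": [
    "Good to 1. Good communication EI : tathagata.kar@ae.com",
    "SAP ECC Project System  EI: ram.vaddadi@ae.com",
]})

# Escape the address so that '.' is matched literally
email = re.escape("tathagata.kar@ae.com")

# str.match anchors at the beginning of each string (like re.match)
starts_with = df["A"].str.match(email)

# str.contains searches the whole string (like re.search)
found_anywhere = df["A"].str.contains(email)

print(starts_with.tolist(), found_anywhere.tolist())
```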

  • 2021-01-20 00:36

    Thanks for the help, but it seems I have found a solution that works for now.

    I have to use str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]). This seems to solve the problem.

    Thanks to @IsaacDj for his help, though.

  • 2021-01-20 00:44

    You could simply use ==

    string_a == string_b
    

    It returns True if the two strings are equal. However, this does not solve your issue.

    Edit 2: You should use len(df1.index) instead of len(df1.columns). Indeed, len(df1.columns) will give you the number of columns, and not the number of rows.
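
    A small illustration of the difference, using a made-up frame (the column names and values are assumptions, not data from the question):

```python
import pandas as pd

# Illustrative frame: 3 rows, 2 columns
df1 = pd.DataFrame({"A": ["x", "y", "z"], "B": [1, 2, 3]})

n_rows = len(df1.index)    # number of rows
n_cols = len(df1.columns)  # number of columns

print(n_rows, n_cols)
```

    An emptiness check on the filtered frame should therefore look at the row count, since the column count stays the same even when no row matched.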

    Edit 3: After reading your second post, I've understood your problem. The solution you propose could lead to some errors. For instance, if you have:

    ls=['tathagata.kar@ae.com','a.kar@ae.com', 'tathagata.kar@ae.co']
    

    the first and the third elements will both match str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]), which is unwanted behaviour: the third one is only a prefix of the real address.

    You could add a check on the end of the string: str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]+r'(?:\s|$)')
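
    The effect of the end check can be seen with the standard `re` module alone. This is only a sketch on one sample line from the question; the variable names are illustrative:

```python
import re

text = "Good to 1. Good communication EI : tathagata.kar@ae.com"
prefix = r"(?:\s|^|Ei:|EI:|EI-)"
suffix = r"(?:\s|$)"
truncated = "tathagata.kar@ae.co"  # third element of ls, missing the final 'm'

# Without an end check, the truncated address matches as a prefix
loose = re.search(prefix + truncated, text) is not None

# With the end check it no longer matches...
strict = re.search(prefix + truncated + suffix, text) is not None

# ...while the complete address still does
full = re.search(prefix + "tathagata.kar@ae.com" + suffix, text) is not None

print(loose, strict, full)
```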

    Like this:

    for i in range(len(ls)):
      # Keep the rows where ls[i] is bounded on both sides
      df1 = df[df['A'].str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]+r'(?:\s|$)')]
      if len(df1.index) != 0:  # at least one row matched
          print(ls[i])
    

    (Remove the parentheses around the "print" argument if you use Python 2.7.)
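
    A slightly more defensive variant, offered as a sketch rather than a definitive fix. It makes two assumptions that go beyond the question: the addresses should be matched literally (so they are passed through `re.escape`, otherwise each '.' matches any character), and the column may contain missing values (so `na=False` is passed). The helper name `find_present` is hypothetical:

```python
import re
import pandas as pd

# Illustrative data modelled on the question; the None row stands in for the blank row
df = pd.DataFrame({"A": [
    "Good to 1. Good communication EI : tathagata.kar@ae.com",
    "SAP ECC Project System  EI: ram.vaddadi@ae.com",
    None,
]})
ls = ["tathagata.kar@ae.com", "a.kar@ae.com", "tathagata.kar@ae.co"]


def find_present(frame, candidates):
    """Return the candidates that occur as a bounded token in column A."""
    found = []
    for item in candidates:
        # Same boundaries as above, with the address escaped
        pattern = r"(?:\s|^|Ei:|EI:|EI-)" + re.escape(item) + r"(?:\s|$)"
        # na=False treats missing values as "no match"
        mask = frame["A"].str.contains(pattern, na=False)
        if mask.any():
            found.append(item)
    return found


present = find_present(df, ls)
print(present)
```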
