Check for words from list and remove those words in pandas dataframe column

闹比i 2021-01-04 10:36

I have a list as follows,

remove_words = ['abc', 'def', 'pls']

The following is the data frame I have, with column name '

2 Answers
  • 2021-01-04 11:30

    Totally taking @MaxU's pattern!

    We can use pd.DataFrame.replace by setting the regex parameter to True and passing a dictionary of dictionaries that specifies the pattern and what to replace with for each column.

    pat = '|'.join([r'\b{}\b'.format(w) for w in remove_words])
    
    df.assign(new=df.replace(dict(string={pat: ''}), regex=True))
    
                   string              new
    0  abc stack overflow   stack overflow
    1              abc123           abc123
    2          def comedy           comedy
    3          definitely       definitely
    4            pls lkjh             lkjh
    5             pls1234          pls1234
    
  • 2021-01-04 11:34

    Try this:

    In [98]: pat = r'\b(?:{})\b'.format('|'.join(remove_words))
    
    In [99]: pat
    Out[99]: '\\b(?:abc|def|pls)\\b'
    
    In [100]: df['new'] = df['string'].str.replace(pat, '', regex=True)  # regex=True is required on pandas >= 2.0
    
    In [101]: df
    Out[101]:
                   string              new
    0  abc stack overflow   stack overflow
    1              abc123           abc123
    2          def comedy           comedy
    3          definitely       definitely
    4            pls lkjh             lkjh
    5             pls1234          pls1234
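
Both answers can be combined into one self-contained script. The following is only a sketch under a few assumptions: the sample data frame is reconstructed from the output printed in the answers (the question's own data is not shown), `re.escape` is added in case a word contains regex metacharacters, and the whitespace cleanup is an optional extra step that neither answer performs.

```python
import re

import pandas as pd

remove_words = ['abc', 'def', 'pls']

# Assumption: sample data rebuilt from the output shown in the answers.
df = pd.DataFrame({'string': ['abc stack overflow', 'abc123', 'def comedy',
                              'definitely', 'pls lkjh', 'pls1234']})

# \b anchors restrict matches to whole words; re.escape protects
# against words that contain regex metacharacters.
pat = r'\b(?:{})\b'.format('|'.join(map(re.escape, remove_words)))

# Series.str.replace with regex=True passed explicitly, followed by an
# optional cleanup of the whitespace left behind by removed words.
df['new'] = (df['string']
             .str.replace(pat, '', regex=True)
             .str.replace(r'\s+', ' ', regex=True)
             .str.strip())

# Equivalent removal via DataFrame.replace with a nested dict
# {column: {pattern: replacement}}; no whitespace cleanup here.
via_replace = df[['string']].replace({'string': {pat: ''}}, regex=True)['string']

print(df)
```

Because the pattern is anchored with `\b`, values such as `abc123` and `definitely` are left untouched; only standalone occurrences of the listed words are removed.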
    