Populating Pandas DataFrame column based on dictionary of regex

清歌不尽 2021-01-20 13:43

I have a dataframe like the following:

    GE    GO
1   AD    Weiss
2   KI    Ruby
3   OH    Port
4   ER    Rose
5   KI    Rose
6   JJ    Weiss
7   OH    7UP         


        
2 Answers
  •  小蘑菇 (original poster)
    2021-01-20 14:02

    One option is to use the re module together with map on the GO column:

    import re
    # For each GO value, take the Dic value of the first pattern that matches
    df['OUT'] = df.GO.map(lambda x: next(Dic[k] for k in Dic if re.search(k, x)))
    df
    

    This raises an error (StopIteration) if none of the patterns match the string. If some strings may not match any pattern, you can write a custom function that catches the exception and returns None:

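    import re
    def findCat(x):
        try:
            # Value of the first pattern in Dic that matches x
            return next(Dic[k] for k in Dic if re.search(k, x))
        except (StopIteration, TypeError):
            # No pattern matched, or x is not a string (e.g. NaN)
            return None
    
    df['OUT'] = df.GO.map(findCat)
    df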
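
The thread does not show the contents of Dic, so here is a minimal self-contained sketch of the same first-match lookup using made-up patterns and labels (group_1 and group_2 are assumptions, not the asker's data). It passes a default of None to next instead of catching the exception, and compiles each pattern once up front; find_cat and compiled are hypothetical names introduced for illustration.

```python
import re

# Hypothetical pattern -> label mapping; the real Dic is not shown in the thread.
Dic = {
    r"Weiss|Port": "group_1",
    r"R(uby|ose)": "group_2",
}

# Compile each pattern once instead of on every lookup.
compiled = [(re.compile(pattern), label) for pattern, label in Dic.items()]

def find_cat(x):
    """Return the label of the first pattern that matches x, or None."""
    if not isinstance(x, str):
        # Missing values such as NaN are not strings and cannot be searched.
        return None
    # The second argument to next() is returned when nothing matches,
    # so no exception handling is needed.
    return next((label for pattern, label in compiled if pattern.search(x)), None)

# Values of the GO column from the question.
values = ["Weiss", "Ruby", "Port", "Rose", "Rose", "Weiss", "7UP"]
out = [find_cat(v) for v in values]
print(out)
```

With the DataFrame from the question, the same function could be applied as df['OUT'] = df.GO.map(find_cat); rows that match no pattern, such as 7UP here, get None.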
    
