How to replace a string using a dictionary containing multiple values for a key in Python

傲寒 2021-01-26 03:16

I have a dictionary that maps each word to its closest related words.

I want to replace the related words in the string with the original word. Currently I am able to replace words in th

1 Answer
  • 2021-01-26 03:31

    I think you can build a new dict of regex patterns, as in this answer, and pass it to replace:

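    # Canonical word -> comma-separated string of related words
    d = {'Indian': 'India, Ind, ind.',
         'Restaurant': 'Hotel, Restrant, Hotpot',
         'Pub': 'Bar, Baar, Beer',
         '1888': '188, 188., 18'}
    
    # Invert the mapping: one whitespace-delimited regex per related word -> canonical word
    d1 = {r'(?<!\S)' + k.strip() + r'(?!\S)': k1
          for k1, v1 in d.items() for k in v1.split(',')}
    
    # df is assumed to be the DataFrame from the question; 'col' holds the strings to clean
    df['col'] = df['col'].replace(d1, regex=True)
    print(df)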
                            col
    0   North Indian Restaurant
    1   South Indian Restaurant
    2        Mexican Restaurant
    3       Italian  Restaurant
    4                  Cafe Pub
    5                 Irish Pub
    6               Maggiee Pub
    7           Jacky Craft Pub
    8               Bristo 1888
    9               Bristo 1888
    10              Bristo 1888
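
    To see what the two lookarounds do without pandas, here is a minimal sketch using only the standard `re` module. The helper names `build_patterns` and `replace_all` and the sample strings are made up for illustration, and the sketch already escapes each related word with `re.escape`, which the snippet above does not.

```python
import re

d = {'Indian': 'India, Ind, ind.',
     'Restaurant': 'Hotel, Restrant, Hotpot',
     'Pub': 'Bar, Baar, Beer'}

def build_patterns(mapping):
    """Return a list of (compiled pattern, canonical word) pairs."""
    patterns = []
    for canonical, related in mapping.items():
        for word in related.split(','):
            # (?<!\S) and (?!\S) require whitespace or a string edge on
            # both sides, so only whole whitespace-delimited words match.
            pat = re.compile(r'(?<!\S)' + re.escape(word.strip()) + r'(?!\S)')
            patterns.append((pat, canonical))
    return patterns

def replace_all(text, patterns):
    """Apply every pattern in turn to a single string."""
    for pat, canonical in patterns:
        text = pat.sub(canonical, text)
    return text

patterns = build_patterns(d)
print(replace_all('North Ind Hotel', patterns))   # North Indian Restaurant
print(replace_all('Cafe Baar', patterns))         # Cafe Pub
print(replace_all('Barbecue Hotels', patterns))   # unchanged: no whole-word match
```

    The last call shows why the lookarounds matter: `Bar` and `Hotel` occur inside longer words there, so nothing is replaced.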
    

    EDIT (Function for the above code):

    def replace_words(d, col):
        # Note: reads and updates the module-level DataFrame df
        d1 = {r'(?<!\S)' + k.strip() + r'(?!\S)': k1
              for k1, v1 in d.items() for k in v1.split(',')}
        df[col] = df[col].replace(d1, regex=True)
        return df[col]
    
    df['col'] = replace_words(d, 'col')
    

    EDIT1:

    If you get an error like:

    regex error- missing ), unterminated subpattern at position 7

    you need to escape the regex special characters in the keys:

    import re
    
    def replace_words(d, col):
        # re.escape makes characters such as '(' or '.' match literally
        d1 = {r'(?<!\S)' + re.escape(k.strip()) + r'(?!\S)': k1
              for k1, v1 in d.items() for k in v1.split(',')}
        df[col] = df[col].replace(d1, regex=True)
        return df[col]
    
    df['col'] = replace_words(d, 'col')
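
    As a small illustration of why the escaping matters, the sketch below compares an escaped and an unescaped pattern using only `re`. The related word `'188.'` comes from the dictionary above; the input strings and the key `'Bar (old'` are made-up examples, not data from the question.

```python
import re

word = '188.'
unescaped = re.compile(r'(?<!\S)' + word + r'(?!\S)')
escaped = re.compile(r'(?<!\S)' + re.escape(word) + r'(?!\S)')

# Unescaped, '.' matches any character, so '1889' is rewritten too.
print(unescaped.sub('1888', 'Bristo 1889'))  # Bristo 1888
# Escaped, only a literal '188.' is rewritten.
print(escaped.sub('1888', 'Bristo 1889'))    # Bristo 1889
print(escaped.sub('1888', 'Bristo 188.'))    # Bristo 1888

# A key with an unbalanced '(' is not a valid pattern unless escaped.
try:
    re.compile(r'(?<!\S)' + 'Bar (old' + r'(?!\S)')
    compiled_ok = True
except re.error:
    compiled_ok = False
print(compiled_ok)  # False

# The escaped form of the same key compiles without error.
re.compile(r'(?<!\S)' + re.escape('Bar (old') + r'(?!\S)')
```

    So escaping is worth doing even when no error is raised, since an unescaped `.` can silently widen what gets replaced.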
    