Strict regex in Pandas replace

前端 未结 1 1800
自闭症患者
自闭症患者 2021-01-22 18:07

I need to write a strict regular expression to replace certain values in my pandas dataframe. This is an issue that was raised after solving the questi

1条回答
  •  南方客
    南方客 (楼主)
    2021-01-22 18:35

    Add \b for word boundaries to each key of dict:

    d = {'UK': 'United Kingdom', 'LA': 'Los Angeles', 'NYC': 'New York City', 'NY' : 'New York'}
    
    data = {'Categories': ['animal','plant','object'],
        'Type': ['tree','dog','rock'],
            'Comment': ['The NYC tree is very big', 'NY The cat from the UK is small',
                        'The rock was found in LA.']
    }
    
    d = {r'\b' + k + r'\b':v for k, v in d.items()}
    
    df = pd.DataFrame(data)
    
    df['commentTest'] = df['Comment'].replace(d, regex=True)
    print (df)
      Categories                          Comment  Type  \
    0     animal         The NYC tree is very big  tree   
    1      plant  NY The cat from the UK is small   dog   
    2     object        The rock was found in LA.  rock   
    
                                             commentTest  
    0                 The New York City tree is very big  
    1  New York The cat from the United Kingdom is small  
    2                 The rock was found in Los Angeles.  
    

    0 讨论(0)
提交回复
热议问题