pandas-python dataframe update a column

隐瞒了意图╮ 2021-01-21 23:52

Say I have a list BRANDS that contains brand names:

BRANDS = ['Samsung', 'Apple', 'Nike', .....]

DataFrame A has the following structure:

3 answers
  •  时光取名叫无心
    2021-01-22 00:19

    One approach is to use apply():

    import pandas as pd
    BRANDS = ['Samsung', 'Apple', 'Nike']
    
    def get_brand_name(row):
        # use `not`, not `~`: on a plain Python bool, ~True is -2 (truthy)
        if not pd.isnull(row['brand_name']):
            # keep the existing value if brand_name is already set
            return row['brand_name']
    
        item_title = row['item_title']
        title_words = map(str.title, item_title.split())
        for tw in title_words:
            if tw in BRANDS:
                # return the first match
                return tw
        # no brand found in the title
        return None
    
    # df is assumed to be the DataFrame from the question
    df['brand_name'] = df.apply(get_brand_name, axis=1)
    print(df)
    #   row     item_title brand_name
    #0    1       Apple 6S      Apple
    #1    2  Nike BB Shoes       Nike
    #2    3     Samsung TV    Samsung
    #3    4      Used bike       None
    

    Notes

    • I converted the tokenized title to title-case using str.title() because that's how you defined BRANDS.
    • If you have a lot of brands, use a set instead of a list, because membership tests (tw in BRANDS) are much faster on a set. A set is unordered, though, so keep the list if anything else in your code depends on the order of BRANDS.
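
The two notes above can be sketched together as follows. This is a minimal illustration, not the only way to do it: the sample DataFrame, the name BRAND_SET, and the helper first_brand are assumptions made up for the example, since the question's actual data is not shown. It uses a set for lookups and only touches rows whose brand_name is missing.

```python
import pandas as pd

# Assumption: a set version of BRANDS, for fast membership tests.
BRAND_SET = {"Samsung", "Apple", "Nike"}

# Hypothetical sample data; the question's real DataFrame is not shown.
df = pd.DataFrame({
    "item_title": ["apple 6S", "Nike BB Shoes", "Samsung TV", "Used bike"],
    "brand_name": [None, None, "Samsung", None],
})

def first_brand(title):
    """Return the first title-cased word found in BRAND_SET, else None."""
    for word in title.split():
        candidate = word.title()  # "apple" -> "Apple", matching BRAND_SET
        if candidate in BRAND_SET:
            return candidate
    return None

# Fill only the rows where brand_name is missing; existing values are kept.
missing = df["brand_name"].isna()
df.loc[missing, "brand_name"] = df.loc[missing, "item_title"].map(first_brand)
```

Because the mask restricts the work to missing rows, the function no longer needs to check brand_name itself, and the lowercase "apple" in the first title still matches thanks to the title-casing step.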
