pandas-python dataframe update a column


隐瞒了意图╮ 2021-01-21 23:52

Say I have a list BRANDS that contains brand names:

BRANDS = ['Samsung', 'Apple', 'Nike', .....]

DataFrame A has the following structure

3 Answers

时光取名叫无心 (OP)

2021-01-22 00:19

One approach is to use apply():

import pandas as pd
BRANDS = ['Samsung', 'Apple', 'Nike']

def get_brand_name(row):
    if pd.notnull(row['brand_name']):
        # brand_name is already set, so keep it as is
        return row['brand_name']

    item_title = row['item_title']
    title_words = map(str.title, item_title.split())
    for tw in title_words:
        if tw in BRANDS:
            # return first 'match'
            return tw
    # default return None
    return None

# df is assumed to already exist with 'item_title' and 'brand_name' columns
df['brand_name'] = df.apply(get_brand_name, axis=1)
print(df)
#   row     item_title brand_name
#0    1       Apple 6S      Apple
#1    2  Nike BB Shoes       Nike
#2    3     Samsung TV    Samsung
#3    4      Used bike       None

Notes

I convert each word of the title to title case with str.title() because that is how the entries in BRANDS are written.
If you have a lot of brands, use a set instead of a list: a membership test on a set takes constant time on average, while a list is scanned element by element. A set is unordered, though, so keep the list if the order of BRANDS matters to you.
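
To illustrate the second note, here is a minimal sketch of the same idea with BRANDS stored in a set. The sample DataFrame and the helper name find_brand are made up for illustration, since the question's actual data is not shown, and the sketch fills only the rows whose brand_name is missing instead of calling apply() on every row.

```python
import pandas as pd

# Same brands as in the answer, stored in a set for fast membership tests.
BRANDS = {"Samsung", "Apple", "Nike"}


def find_brand(item_title):
    """Return the first title-cased word of item_title found in BRANDS, else None."""
    for word in item_title.split():
        candidate = word.title()
        if candidate in BRANDS:
            return candidate
    return None


# Hypothetical sample data; the question's real DataFrame is not shown.
df = pd.DataFrame({
    "item_title": ["apple 6S", "Nike BB Shoes", "Samsung TV", "Used bike"],
    "brand_name": [None, "Nike", None, None],
})

# Only rows with a missing brand_name are looked up; existing values are kept.
missing = df["brand_name"].isna()
df.loc[missing, "brand_name"] = df.loc[missing, "item_title"].map(find_brand)
print(df)
```

Because the loop walks the words of the title rather than the entries of BRANDS, the first match is decided by word order in the title, so switching from a list to a set does not change the result here.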
