Create pandas dataframe with multiple dataframes

死守一世寂寞 2021-01-23 17:06

I have a CSV file like this:

Fruit_Type;Fruit_Color;Fruit_Description
Apple;Green,Red,Yellow;Just an apple
Banana;Green,Yellow;Just a Banana
Orange;Red,Yellow;Jus         


        
2 answers
  • 2021-01-23 17:36

    I suggest using str.get_dummies:

    df = df.join(df.pop('Fruit_Color').str.get_dummies(','))
    print(df)
      Fruit_Type Fruit_Description  Green  Red  Yellow
    0      Apple     Just an apple      1    1       1
    1     Banana     Just a Banana      1    0       1
    2     Orange    Just an Orange      0    1       1
    3      Grape      Just a Grape      0    0       0
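
For a self-contained run of the same idea, here is a minimal sketch. The sample text is reconstructed from the question and the printed output above; the Grape row with an empty colour field is an assumption, since the question's CSV is cut off. Note that the file uses `;` as the field separator, so `sep=";"` is needed when reading it.

```python
import io

import pandas as pd

# Sample data reconstructed from the question and the printed output;
# the Grape row (empty colour field) is an assumption.
csv_text = """Fruit_Type;Fruit_Color;Fruit_Description
Apple;Green,Red,Yellow;Just an apple
Banana;Green,Yellow;Just a Banana
Orange;Red,Yellow;Just an Orange
Grape;;Just a Grape
"""

# Fields are separated by ';', colours inside a field by ','.
df = pd.read_csv(io.StringIO(csv_text), sep=";")

# pop() removes the colour column and returns it; get_dummies(",") splits each
# value on ',' and builds one indicator column per distinct colour.
dummies = df.pop("Fruit_Color").str.get_dummies(",")
df = df.join(dummies)
print(df)
```

A missing colour value is treated as having no colours, so that row gets 0 in every indicator column.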
    
  • 2021-01-23 17:39

    You can create the columns using assign:

    df.assign(
       green=lambda d: d['Fruit_Color'].str.contains('Green', case=True),
       red=lambda d: d['Fruit_Color'].str.contains('Red', case=True),
       yellow=lambda d: d['Fruit_Color'].str.contains('Yellow', case=True),
    )
    

    This results in a new dataframe with three additional columns of Booleans, namely "green", "red" and "yellow".

    To detect a row with no known colour, you can also assign other_color=lambda d: ~(d['green'] | d['red'] | d['yellow']).
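
As a rough sketch of how the pieces fit together, the `other_color` flag can go in the same `assign` call, because later keyword arguments can refer to columns created by earlier ones. The sample frame below is made up for illustration, and `na=False` is an addition not in the answer: it makes a missing colour value count as "no match" instead of producing a missing flag.

```python
import pandas as pd

# Illustrative data; the Grape row has no colour information.
df = pd.DataFrame({
    "Fruit_Type": ["Apple", "Banana", "Orange", "Grape"],
    "Fruit_Color": ["Green,Red,Yellow", "Green,Yellow", "Red,Yellow", None],
})

flagged = df.assign(
    # na=False (an assumption added here) maps missing values to False.
    green=lambda d: d["Fruit_Color"].str.contains("Green", na=False),
    red=lambda d: d["Fruit_Color"].str.contains("Red", na=False),
    yellow=lambda d: d["Fruit_Color"].str.contains("Yellow", na=False),
    # Evaluated after the three columns above exist in the intermediate frame.
    other_color=lambda d: ~(d["green"] | d["red"] | d["yellow"]),
)
print(flagged)
```

`assign` returns a new dataframe, so `df` itself is left unchanged.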

    Another possibility is to use pandas.concat to concatenate multiple dataframes, but it's less elegant than the above solution.
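
One possible reading of the `pandas.concat` route, sketched under the assumption that the flag columns are built as a separate dataframe and then glued on column-wise. The sample data and the names `colors`, `flags` and `result` are hypothetical.

```python
import pandas as pd

# Illustrative data, not taken from the question's file.
df = pd.DataFrame({
    "Fruit_Type": ["Apple", "Banana", "Orange"],
    "Fruit_Color": ["Green,Red,Yellow", "Green,Yellow", "Red,Yellow"],
})

colors = ["Green", "Red", "Yellow"]

# Build a second dataframe holding one Boolean column per colour.
flags = pd.DataFrame(
    {color: df["Fruit_Color"].str.contains(color, na=False) for color in colors}
)

# axis=1 concatenates side by side, aligning rows on the shared index.
result = pd.concat([df.drop(columns="Fruit_Color"), flags], axis=1)
print(result)
```

This needs an explicit list of colours and an extra intermediate frame, which is part of why it reads as less elegant than `assign` or `str.get_dummies`.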
