Converting a pandas DataFrame column into one-hot labels

余生分开走 2020-12-20 16:08

I have a pandas dataframe similar to this:

  Col1   ABC
0  XYZ    A
1  XYZ    B
2  XYZ    C

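By using the pandas get_dummies() function on column ABC, I get one indicator column per category (A, B, C). What I want instead is a single column that holds the one-hot vector for each row as a list.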

4 Answers
  • 2020-12-20 16:31

    If you have a pd.DataFrame like this:

    >>> df
      Col1  A  B  C
    0  XYZ  1  0  0
    1  XYZ  0  1  0
    2  XYZ  0  0  1
    

    You can always do something like this:

    >>> df.apply(lambda s: list(s[1:]), axis=1)
    0    [1, 0, 0]
    1    [0, 1, 0]
    2    [0, 0, 1]
    dtype: object
    

    Note that this is essentially a for-loop over the rows. Also, pandas has no list dtype: a column that holds lists must have dtype object, so operations on it cannot take advantage of NumPy's vectorized speed.
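
To make the point about row-wise loops concrete, here is a minimal sketch that compares the apply() call with a bulk conversion through NumPy. The frame is a hypothetical copy of the one shown in this answer, and the names via_apply and via_numpy are made up for illustration. Both results are still object-dtype columns of lists; only the way they are built differs.

```python
import pandas as pd

# Hypothetical frame mirroring the one shown in the answer.
df = pd.DataFrame({
    "Col1": ["XYZ", "XYZ", "XYZ"],
    "A": [1, 0, 0],
    "B": [0, 1, 0],
    "C": [0, 0, 1],
})

# Row-wise apply: the lambda is called once per row in Python.
via_apply = df.apply(lambda s: list(s.iloc[1:]), axis=1)

# Bulk conversion: take the indicator columns as one NumPy array,
# then turn that array into nested lists in a single call.
via_numpy = pd.Series(df[["A", "B", "C"]].to_numpy().tolist(), index=df.index)

print(via_apply.dtype, via_numpy.dtype)
print(via_numpy.tolist())
```

On a three-row frame the difference is not measurable; the bulk version is only likely to matter for large frames.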

  • 2020-12-20 16:32

    Here is an example of using sklearn.preprocessing.LabelBinarizer:

    In [361]: from sklearn.preprocessing import LabelBinarizer
    
    In [362]: lb = LabelBinarizer()
    
    In [363]: df['new'] = lb.fit_transform(df['ABC']).tolist()
    
    In [364]: df
    Out[364]:
      Col1 ABC        new
    0  XYZ   A  [1, 0, 0]
    1  XYZ   B  [0, 1, 0]
    2  XYZ   C  [0, 0, 1]
    

    Pandas alternative:

    In [370]: df['new'] = df['ABC'].str.get_dummies().values.tolist()
    
    In [371]: df
    Out[371]:
      Col1 ABC        new
    0  XYZ   A  [1, 0, 0]
    1  XYZ   B  [0, 1, 0]
    2  XYZ   C  [0, 0, 1]
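
One detail about the pandas alternative: Series.str.get_dummies() splits each string on a separator (sep, "|" by default) before encoding, so a value that contains the separator produces several 1s in its row. The sketch below uses made-up data (the labels Series is hypothetical and not from the question) to show that behaviour.

```python
import pandas as pd

# Hypothetical labels; the last entry carries two labels joined by "|".
labels = pd.Series(["A", "B", "A|C"])

# Each string is split on sep, then one indicator column is built per label.
multi_hot = labels.str.get_dummies(sep="|")

rows = multi_hot.to_numpy().tolist()
print(list(multi_hot.columns))
print(rows)
```

For a column like ABC in the question, where every value is a single label without "|", this gives the same vectors as the one-hot encodings above.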
    
  • 2020-12-20 16:40

    You can just use tolist():

    df['ABC'] = pd.get_dummies(df.ABC).values.tolist()
    
      Col1        ABC
    0  XYZ  [1, 0, 0]
    1  XYZ  [0, 1, 0]
    2  XYZ  [0, 0, 1]
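
A note on versions: since pandas 2.0, pd.get_dummies() returns boolean columns by default, so the same line would produce lists of True/False instead of 1/0. A minimal sketch, assuming the frame from the question, that passes dtype=int to get integer vectors on any recent version (the one_hot variable is just an illustrative name):

```python
import pandas as pd

# Frame from the question, rebuilt here so the snippet runs on its own.
df = pd.DataFrame({"Col1": ["XYZ"] * 3, "ABC": ["A", "B", "C"]})

# dtype=int forces 0/1 integers instead of the boolean default in pandas >= 2.0.
one_hot = pd.get_dummies(df["ABC"], dtype=int)

# Replace the label column with one list per row.
df["ABC"] = one_hot.to_numpy().tolist()
print(df)
```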
    
  • 2020-12-20 16:48

    If you have a DataFrame df with a categorical column ABC, you can create a new column of one-hot vectors like this (get_values() was removed in pandas 1.0, so to_numpy() is used instead):

    df['new_column'] = list(pandas.get_dummies(df['ABC']).to_numpy())
    