Pandas get_dummies() for numeric categorical data

遥遥无期 2021-01-13 01:25

I have 2 columns:

  • Sex (categorical values stored as strings: 'male' and 'female')
  • Class (categorical values stored as integers from 1 to 10)
2 Answers
  • 2021-01-13 01:40

    You can convert the values to strings first, so that get_dummies treats every column as categorical:

    df1 = pd.get_dummies(df.astype(str))
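
One caveat with the string conversion: `df.astype(str)` converts every column, so any genuinely numeric column would be dummy-encoded as well. The following is a minimal sketch of restricting the cast to the integer-coded column. The `age` column and the sample values are hypothetical and only there to show a numeric column passing through untouched.

```python
import pandas as pd

# Hypothetical sample data; `age` is an assumed extra numeric column
# that is not part of the original question.
df = pd.DataFrame({
    "sex": ["male", "female", "male", "female"],
    "class": [1, 2, 3, 1],
    "age": [22, 35, 41, 29],
})

# Cast only the integer-coded categorical column to str; `age` stays numeric.
df_str = df.astype({"class": str})

# With no `columns` argument, get_dummies encodes the string columns
# (`sex` and `class`) and leaves `age` as an ordinary column.
dummies = pd.get_dummies(df_str)
print(list(dummies.columns))
```

Passing a dict to `astype` keeps the conversion explicit per column, which may be easier to reason about than converting the whole frame.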
    
  • 2021-01-13 01:56

    If you don't want to convert your data, you can use the 'columns' argument of get_dummies. Here is a quick walkthrough.

    First, a data frame reproduced from your description:

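    import pandas as pd

    sex_labels = ['male', 'female']
    sex_col = [sex_labels[i % 2] for i in range(10)]  # alternate male/female
    class_col = list(range(10))                       # integer class codes 0..9
    df = pd.DataFrame({'sex': sex_col, 'class': class_col})
    df['sex'] = pd.Categorical(df['sex'])             # store sex as category dtype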
    

    The dtypes are:

    print(df.dtypes)
    sex      category
    class       int64
    dtype: object
    

    Apply get_dummies:

    df = pd.get_dummies(df, columns=['sex', 'class'])
    

    Verify:

    print(df.columns)

    Output:

    Index(['sex_female', 'sex_male', 'class_0', 'class_1', 'class_2', 'class_3',
           'class_4', 'class_5', 'class_6', 'class_7', 'class_8', 'class_9'],
          dtype='object')
    

    Per the docs at https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.get_dummies.html:

    If columns is None then all the columns with object or category dtype will be converted

    This is why, when 'columns' is not given, you only see dummies for the sex column and not for class: class has dtype int64, so it is skipped by default.
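
To make the dtype rule concrete, here is a small sketch comparing the default call with the explicit `columns` call on a frame shaped like the one above. It also shows one possible alternative, casting `class` to the category dtype so the default behaviour picks it up. It assumes a reasonably recent pandas. The variable names `default`, `explicit` and `as_category` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "sex": ["male", "female"] * 5,
    "class": list(range(10)),
})

# Default call: only string/category columns are encoded, so the
# int64 `class` column is passed through unchanged.
default = pd.get_dummies(df)
print(list(default.columns))

# Explicit call: columns named in `columns` are encoded whatever their dtype.
explicit = pd.get_dummies(df, columns=["sex", "class"])
print(len(explicit.columns))

# Alternative: mark `class` as categorical, then rely on the default rule.
as_category = pd.get_dummies(df.astype({"class": "category"}))
print(len(as_category.columns))
```

The `columns` argument is a per-call choice, while the category cast records the intent on the data frame itself, which can help if the same frame is encoded in several places.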

    Hope this helps. Happy learning!

    Note: Tested with pandas version '0.25.2'
