I am using OneHotEncoder to encode a few categorical variables (e.g. Sex and AgeGroup). The resulting feature names from the encoder look like 'x0_female', 'x0_male', 'x1_0.
column_name = encoder.get_feature_names(['Sex', 'AgeGroup'])
one_hot_encoded_frame = pd.DataFrame(train_X_encoded, columns=column_name)
You can pass the list of original column names to get_feature_names:
encoder.get_feature_names(['Sex', 'AgeGroup'])
will return:
['Sex_female', 'Sex_male', 'AgeGroup_0', 'AgeGroup_15',
'AgeGroup_30', 'AgeGroup_45', 'AgeGroup_60', 'AgeGroup_75']
Thanks for a nice solution. @Nursnaaz, the sparse matrix needs to be converted into a dense matrix first:
column_name = encoder.get_feature_names(['Sex', 'AgeGroup'])
one_hot_encoded_frame = pd.DataFrame(train_X_encoded.todense(), columns=column_name)
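For anyone reading this on a recent scikit-learn: get_feature_names was deprecated in version 1.0 and removed in 1.2, with get_feature_names_out as its replacement. Below is a minimal end-to-end sketch of the approach from this thread that should work on either side of that change. The toy train_X frame and its values are made up for illustration and are not the asker's data; the variable names simply follow the snippets above.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy data standing in for the question's training set (values are made up).
train_X = pd.DataFrame({
    "Sex": ["female", "male", "female", "male"],
    "AgeGroup": [0, 15, 30, 15],
})

# By default the encoder returns a SciPy sparse matrix.
encoder = OneHotEncoder()
train_X_encoded = encoder.fit_transform(train_X)

# Newer scikit-learn releases only have get_feature_names_out;
# fall back to get_feature_names on older installs.
if hasattr(encoder, "get_feature_names_out"):
    column_name = encoder.get_feature_names_out(["Sex", "AgeGroup"])
else:
    column_name = encoder.get_feature_names(["Sex", "AgeGroup"])

# Densify the sparse matrix before handing it to pandas.
one_hot_encoded_frame = pd.DataFrame(
    train_X_encoded.toarray(),
    columns=column_name,
    index=train_X.index,
)

print(list(one_hot_encoded_frame.columns))
```

Here toarray() is used instead of todense() because it returns a plain NumPy array rather than a numpy.matrix; either one gets rid of the sparse container. Passing index=train_X.index keeps the encoded rows aligned with the original frame in case it is joined back later.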