dictvectorizer

How to encode categorical features in sklearn?

流过昼夜 posted on 2019-12-21 05:29:10
Question: I have a dataset with 41 features (columns 0 to 40), of which 7 are categorical. The categorical features fall into two subsets: a subset of string type (columns 1, 2, 3) and a subset of int type in binary form, 0 or 1 (columns 6, 11, 20, 21). The string-typed columns 1, 2 and 3 have cardinality 3, 66 and 11 respectively. I need to encode these features so that I can use a support vector machine. This is the code that I have: import numpy as np
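
For readers following the setup described in the question, here is a minimal sketch of one possible encoding, assuming scikit-learn's `DictVectorizer` (the tag this page is filed under). It is not the asker's code. The toy rows, the `f{i}` feature names, and the `row_to_dict` helper are hypothetical stand-ins for the real 41-column dataset. String values are expanded into one-hot columns, while numeric values, including the 0/1 flags, pass through unchanged, since binary columns need no further encoding.

```python
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

# Hypothetical toy data standing in for the real dataset:
# column 0 is numeric, columns 1-3 are strings, column 4 is a 0/1 flag.
rows = [
    [0.5, "tcp", "http", "SF", 0],
    [1.2, "udp", "dns", "S0", 1],
    [0.1, "tcp", "ftp", "SF", 0],
    [2.3, "icmp", "http", "REJ", 1],
]
labels = np.array([0, 1, 0, 1])

# Indices of the string-typed categorical columns (assumed layout).
string_cols = {1, 2, 3}


def row_to_dict(row):
    """Turn a row into a {feature_name: value} dict.

    String values are kept as str so DictVectorizer one-hot encodes them;
    everything else is cast to float and passed through as-is.
    """
    return {
        f"f{i}": (value if i in string_cols else float(value))
        for i, value in enumerate(row)
    }


# sparse=False returns a dense NumPy array, which is convenient for small data.
vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform([row_to_dict(r) for r in rows])

# Fit an SVM on the encoded matrix.
clf = SVC(kernel="rbf")
clf.fit(X, labels)

print(vectorizer.feature_names_)
print(X.shape)
```

With high-cardinality columns such as the 66-value one, leaving `sparse=True` (the default) keeps memory use down, and `SVC` accepts sparse input. SVMs are also sensitive to feature scale, so scaling the numeric columns before fitting is usually worth considering.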