How to manually select the features of the decision tree

╄→尐↘猪︶ㄣ submitted on 2021-01-29 07:49:18

Question


I need to be able to choose which features (in the machine learning sense) are used to build the decision tree. Taking the Iris dataset as an example, I want to be able to select sepal length as the feature used in the root node, petal length as the feature used in the nodes of the first level, and so on.

To be clear, my aim is not to change the minimum samples per split or the random state of the decision tree, but rather to select the features (the characteristics of the elements being classified) and assign them to specific nodes of the decision tree.

The code should then be able to find the best threshold (the range for each node) to generate the best split.

Here is some general code for generating the tree.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Default tree: all four features are candidates at every node.
clf = DecisionTreeClassifier(random_state=0)

iris = load_iris()

clf.fit(iris.data, iris.target)

Has any of you ever done this?


Answer 1:


Has any of you ever done this?

No, you are probably the first one!

Haha, but you can select the features in several ways. One of them is shown in the official documentation: https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html

from sklearn import datasets

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features (sepal length, sepal width).
y = iris.target

Then fit the classifier on the reduced data: clf.fit(X, y)
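The slice above keeps the first two columns. To pick the features named in the question (sepal length and petal length), the same idea can be sketched with a list of column indices. The variable names `selected` and `used` are illustrative and not part of scikit-learn. Note that this only restricts which features the tree may use; it does not pin a feature to a particular node.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Column indices in iris.data: 0 = sepal length, 1 = sepal width,
# 2 = petal length, 3 = petal width.
selected = [0, 2]  # sepal length and petal length (illustrative choice)
X = iris.data[:, selected]
y = iris.target

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# tree_.feature holds the column index used at each node (negative for leaves).
# The indices refer to the columns of X, so map them back through `selected`.
used = sorted({selected[i] for i in clf.tree_.feature if i >= 0})
print([iris.feature_names[i] for i in used])
```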

Other ways to do it are explained here: Selecting multiple columns in a pandas dataframe
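Selecting columns does not by itself decide which feature ends up at which level. As far as I know, DecisionTreeClassifier has no option for that, so one possible workaround is to build the tree by hand: at each level, fit a depth-1 tree on the single chosen column and let scikit-learn find the threshold. This is only a rough sketch. `build_forced_tree` and `predict_one` are hypothetical helpers, the nested-dict tree format is made up for illustration, and the thresholds are chosen greedily per node, so the result is not guaranteed to be globally optimal.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def build_forced_tree(X, y, features_by_level, depth=0):
    """Hypothetical helper: nodes at `depth` split only on features_by_level[depth]."""
    values, counts = np.unique(y, return_counts=True)
    majority = values[np.argmax(counts)]
    # Stop when the node is pure or no feature is assigned to this level.
    if len(values) == 1 or depth >= len(features_by_level):
        return {"leaf": True, "label": majority}

    feature = features_by_level[depth]
    # A depth-1 tree fit on one column lets scikit-learn pick the threshold.
    stump = DecisionTreeClassifier(max_depth=1, random_state=0)
    stump.fit(X[:, [feature]], y)
    if stump.tree_.node_count == 1:  # no split was found on this feature
        return {"leaf": True, "label": majority}

    threshold = stump.tree_.threshold[0]
    left = X[:, feature] <= threshold
    if left.all() or not left.any():  # guard against an empty child
        return {"leaf": True, "label": majority}

    return {
        "leaf": False,
        "feature": feature,
        "threshold": threshold,
        "left": build_forced_tree(X[left], y[left], features_by_level, depth + 1),
        "right": build_forced_tree(X[~left], y[~left], features_by_level, depth + 1),
    }


def predict_one(node, row):
    """Walk the nested dict from the root down to a leaf and return its label."""
    while not node["leaf"]:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


iris = load_iris()
# 0 = sepal length at the root, 2 = petal length at the first level.
tree = build_forced_tree(iris.data, iris.target, features_by_level=[0, 2])
preds = np.array([predict_one(tree, row) for row in iris.data])
print("root feature:", iris.feature_names[tree["feature"]])
print("root threshold:", tree["threshold"])
print("training accuracy:", (preds == iris.target).mean())
```

Each entry of `features_by_level` applies to every node at that depth; giving different nodes on the same level different features would need a richer specification than a flat list.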



Source: https://stackoverflow.com/questions/58233148/how-to-manually-select-the-features-of-the-decision-tree
