Question
I have a requirement to group each of my categorical variables (those with more than 5 category values) into 5 groups based on their association with my continuous variable. To achieve this I am using rpart with the "anova" method.
For example, my categorical variable is type, with codes 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15, and I want to reduce it to 5 groups. After growing the tree, I need to prune it in order to end up with only 5 groups. One approach I tried is to use nsplit from the cptable, but an nsplit of 5 might give me 7-8 leaves, and similarly an nsplit of 4 might give me 5-6 leaves.
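A minimal sketch of this approach, for reference. The data are simulated, and the data frame `dat`, the response `y`, and the `rpart.control` settings are assumptions made only for illustration, not the real data or settings. It prunes the tree at each cp value listed in the cptable and counts the leaves that result:

```r
library(rpart)

# Simulated stand-in data (assumption): a 15-level factor `type`
# and a continuous response `y` that depends on it.
set.seed(1)
type <- factor(sample(1:15, 600, replace = TRUE))
dat  <- data.frame(type = type, y = as.numeric(type) + rnorm(600))

# Grow a deliberately large anova tree so there is something to prune.
fit <- rpart(y ~ type, data = dat, method = "anova",
             control = rpart.control(cp = 0, minsplit = 2, xval = 0))

# Prune at every cp value in the cptable and count the resulting leaves.
leaves <- sapply(fit$cptable[, "CP"], function(cp) {
  pruned <- prune(fit, cp = cp)
  sum(pruned$frame$var == "<leaf>")
})
print(cbind(fit$cptable[, c("CP", "nsplit")], leaves = leaves))

# If a 5-leaf tree is in the pruning sequence, map each level to its leaf.
row5 <- which(fit$cptable[, "nsplit"] == 4)
if (length(row5) == 1) {
  pruned5 <- prune(fit, cp = fit$cptable[row5, "CP"])
  groups  <- tapply(pruned5$where, dat$type, unique)  # leaf row per level
  print(groups)
}
```

Since rpart grows binary trees, the leaf count in each row should equal nsplit + 1, so the printed table shows directly which cp values could give five leaves. If no row has nsplit equal to 4, a 5-leaf tree is not part of the cost-complexity pruning sequence for that fit.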
I am looking for a way to prune the tree so that I get exactly 5 leaves, which would serve as my 5 groups.
Can someone please suggest how I can achieve this using rpart?
Thank you!
Source: https://stackoverflow.com/questions/49228786/rpart-find-number-of-leaves-that-a-cp-value-to-pruning-a-tree-would-return