How to use classwt in randomForest in R?

南笙 2021-02-05 03:24

I have a highly imbalanced data set with target class instances in the following ratio 60000:1000:1000:50 (i.e. a total of 4 classes). I want to use randomFor

3 answers
  •  孤独总比滥情好
    2021-02-05 03:57

    classwt is passed on to randomForest correctly. This example, with deliberately extreme weights, shows the effect:

    library(randomForest)
    rf = randomForest(Species~., data = iris, classwt = c(1E-5,1E-5,1E5))
    rf
    
    #Call:
    # randomForest(formula = Species ~ ., data = iris, classwt = c(1e-05, 1e-05, 1e+05)) 
    #               Type of random forest: classification
    #                     Number of trees: 500
    #No. of variables tried at each split: 2
    #
    #        OOB estimate of  error rate: 66.67%
    #Confusion matrix:
    #           setosa versicolor virginica class.error
    #setosa          0          0        50           1
    #versicolor      0          0        50           1
    #virginica       0          0        50           0
    

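    Class weights are the priors on the classes. In the example above the third class is weighted so heavily that every observation is predicted as virginica. You need to balance the weights to get the results you want.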
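
    As a rough sketch of what "balancing" could look like, the snippet below sets classwt to inverse class frequencies. This heuristic is an assumption for illustration and not part of the original answer; the imbalanced subset imb and the names counts, w and rf_w are made up for the example.

    ```r
    library(randomForest)

    set.seed(1)  # make the forest reproducible

    # Hypothetical imbalanced data: 50 setosa, 50 versicolor, 10 virginica.
    imb <- iris[c(1:100, 101:110), ]

    # Inverse-frequency weights: rarer classes get larger priors.
    counts <- table(imb$Species)
    w <- as.numeric(sum(counts) / counts)
    w <- w / sum(w)  # normalised for readability; classwt need not sum to one

    rf_w <- randomForest(Species ~ ., data = imb, classwt = w)
    rf_w$confusion  # OOB confusion matrix with per-class error
    ```

    Whether these weights are a good choice depends on the data, so compare the per-class errors in the confusion matrix against an unweighted fit before relying on them.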


    On strata and sampsize this answer might be of help: https://stackoverflow.com/a/20151341/2874779

    In general, using the same sampsize for all classes seems reasonable. strata is a factor that is used for stratified resampling. In your case you don't need to supply it, since stratifying by the response is what you want.
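
    A minimal sketch of equal per-class sampling, assuming the rarest class size is used as the common sample size (that choice, the imbalanced subset imb2, and the names n_min and rf_bal are illustrative, not from the original answer). strata is written out explicitly here only to make the stratification visible:

    ```r
    library(randomForest)

    set.seed(42)  # make the per-tree samples reproducible

    # Hypothetical imbalanced data: 50 setosa, 50 versicolor, 10 virginica.
    imb2 <- iris[c(1:100, 101:110), ]

    # Draw as many rows per class as the rarest class has.
    n_min <- min(table(imb2$Species))

    rf_bal <- randomForest(
      Species ~ ., data = imb2,
      strata   = imb2$Species,                      # stratify by the response
      sampsize = rep(n_min, nlevels(imb2$Species))  # same draw size per class
    )
    rf_bal$confusion  # OOB confusion matrix with per-class error
    ```

    Each tree then sees a balanced sample, at the cost of using only a small part of the majority classes per tree.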
