Apply weights in rpart model gives error

Posted by 冷眼眸甩不掉的悲伤 on 2019-12-14 02:28:28

Question


I'm using the rpart package to fit some models, like this:

fitmodel = function(formula, data, w) {

    fit = rpart(formula, data, weights = w)
}

Calling the custom function:

fit = fitmodel(y ~ x1 + x2, data, w)

This causes the error:

Error in eval(expr, envir, enclos) : object 'w' not found

Then I decided to add the weights to the data frame first:

fitmodel = function(formula, data, w) {

    data$w = w
    fit = rpart(formula, data, weights = w)
}

This works, but there's another problem. This call succeeds:

fit = fitmodel(y ~ x1 + x2, data, w)

This call fails:

fit = fitmodel(y ~ ., data, w)

Error in eval(expr, envir, enclos) : object 'w' not found

What's the correct way to apply weights inside a custom function? Thanks!


Answer 1:


Hopefully someone else can give a more complete answer. rpart can't find w because the weights argument is evaluated through model.frame, which looks for variables first in data and then in the environment of the formula, not in the function that happens to call rpart. The formula was created in some other environment, most likely the GlobalEnv, while w exists only inside a function.

Changing the environment of the formula to the environment where w is defined, here with parent.frame(), fixes that. rpart can still find everything else, since lookup always continues through the enclosing environments up to the GlobalEnv. I'm not sure why sys.frame(sys.nframe()) works, since the environments aren't the same, but apparently w is still reachable from it.

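Edit: sys.frame(sys.nframe()) seems to be the same as setting the environment of the formula to the environment of the function rpart is called in (foo3 in this example). In that case, rpart looks for w in foo3's frame, where it is a function argument, and then in the environment foo3 was defined in, here the GlobalEnv. bar3's frame is not part of that chain, because R resolves names through enclosing environments rather than through the call stack.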
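The lookup order described above can be reproduced without rpart. The following is a minimal sketch using base R's model.frame, which evaluates weights in data first and then in the environment of the formula. The names d, make_frame and make_frame_fixed are made up for illustration.

```r
# Toy data; the column names are arbitrary.
d <- data.frame(y = c(1, 2, 3, 4), x = c(2, 1, 4, 3))

make_frame <- function(formula, data, w) {
  # `w` exists only in this function's frame, which is NOT where
  # model.frame looks: it searches `data`, then environment(formula).
  model.frame(formula, data = data, weights = w)
}

make_frame_fixed <- function(formula, data, w) {
  # Point the formula at this frame so that `w` becomes visible.
  environment(formula) <- environment()
  model.frame(formula, data = data, weights = w)
}

f <- y ~ x  # created outside the functions, so its environment lacks `w`

res1 <- try(make_frame(f, d, c(1, 2, 1, 2)), silent = TRUE)
inherits(res1, "try-error")  # the lookup of `w` fails here

mf <- make_frame_fixed(f, d, c(1, 2, 1, 2))
model.weights(mf)            # the weights that ended up in the model frame
```

The only difference between the two helpers is the reassignment of environment(formula), which is the same change that separates foo from foo3.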

library(rpart)
data(iris)

bar <- function(formula, data) {
   w <- rpois(nrow(data), 1)  # random case weights, one per row of data
   print(environment())
   foo(formula, data, w)
}

foo <- function(formula, data, w) {
  print(environment(formula))
  fit <- rpart(formula, data, weights = w)
  return(fit)
}


bar(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x1045b1a78>
## <environment: R_GlobalEnv>
## Error in eval(expr, envir, enclos) (from #2) : object 'w' not found


bar2 <- function(formula, data) {
  w <- rpois(nrow(data), 1)  # random case weights, one per row of data
  print(environment())
  foo2(formula, data, w)
}

foo2 <- function(formula, data, w) {
  print(environment(formula))
  environment(formula) <- parent.frame()
  print(environment(formula))
  fit <- rpart(formula, data, weights = w)
  return(fit)
}

bar2(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x100bf5910>
## <environment: R_GlobalEnv>
## <environment: 0x100bf5910>


bar3 <- function(formula, data) {
  w <- rpois(nrow(data), 1)  # random case weights, one per row of data
  print(environment())
  foo3(formula, data, w)
}

foo3 <- function(formula, data, w) {
  print(environment(formula))
  environment(formula) <- environment() ## seems to be the same as sys.frame(sys.nframe())
  print(environment(formula))
  print(environment())
  fit <- rpart(formula, data, weights = w)
  return(fit)
}

bar3(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x104e11bb8>
## <environment: R_GlobalEnv>
## <environment: 0x104b4ff78>
## <environment: 0x104b4ff78>
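
A variation on foo3, offered as a sketch rather than an established rpart idiom: instead of replacing the formula's environment outright, the weights can be stored in a small child environment whose parent is the formula's original environment, so anything the formula could already see stays visible. The names fit_weighted, .w and case_w are hypothetical.

```r
library(rpart)

fit_weighted <- function(formula, data, w) {
  # Child environment: holds the weights and inherits from the
  # environment the formula was created in.
  env <- new.env(parent = environment(formula))
  env$.w <- w
  environment(formula) <- env
  # `.w` is not a column of `data`, so it is resolved in `env`.
  rpart(formula, data = data, weights = .w)
}

# Arbitrary illustrative weights, one per row of iris.
case_w <- rep(c(1, 2), length.out = nrow(iris))

# Works with a `.` formula, because no extra column is added to the data.
fit <- fit_weighted(Species ~ ., iris, case_w)
class(fit)
```

Because the weights never become a column of data, a formula such as y ~ . does not pick them up as a predictor.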



Answer 2:


According to the rpart documentation (March 12, 2017, page 23, section 6.1), "Weights are not yet supported, and will be ignored if present."

https://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf




Answer 3:


I've managed to solve this using the code below, but I'm sure there's a better way:

The weak learner

fitmodel = function(formula, data, w) {

    # store the weights as a column of the data frame so rpart can find them
    data$w = w
    rpart(formula, data, weights = w, control = rpart.control(maxdepth = 1))
}

The algorithm

ada.boost = function(formula, data, wl.FUN = fitmodel, test.data = NULL, M = 100) {

    # Rewrite the formula with explicit term names to get rid of any '.'
    dep.var = all.vars(formula)[1]
    vars = attr(terms(formula, data = data), "term.labels")
    formula = as.formula(paste(dep.var, "~", paste(vars, collapse = "+")))

    # ...more code
}

Now everything works!
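
To see what the formula rewrite does in isolation, here is a small sketch that applies the same three steps to the built-in iris data. The helper name expand_dot is made up for illustration, and passing env = environment(formula) is an addition of this sketch so the rewritten formula keeps the original formula's environment.

```r
expand_dot <- function(formula, data) {
  # Assumes the left-hand side is a plain variable name.
  dep.var <- all.vars(formula)[1]
  # terms() expands '.' into the columns of `data` not used elsewhere.
  vars <- attr(terms(formula, data = data), "term.labels")
  as.formula(paste(dep.var, "~", paste(vars, collapse = " + ")),
             env = environment(formula))
}

f <- expand_dot(Species ~ ., iris)
print(f)
# Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
```

Because the rewritten formula names its predictors explicitly, the w column that fitmodel adds afterwards is not treated as a predictor. Note that all.vars(formula)[1] returns only a variable name, so a transformed response such as I(Species == "versicolor") would lose its transformation with this approach.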



Source: https://stackoverflow.com/questions/22258739/apply-weights-in-rpart-model-gives-error
