Question
I'm using the rpart package to fit some models, like this:
fitmodel = function(formula, data, w) {
fit = rpart(formula, data, weights = w)
}
Call the custom function
fit = fitmodel(y ~ x1 + x2, data, w)
This causes the error:
Error in eval(expr, envir, enclos) : object 'w' not found
Then I decided to use:
fitmodel = function(formula, data, w) {
data$w = w
fit = rpart(formula, data, weights = w)
}
This works, but there's another problem:
This will work
fit = fitmodel(y ~ x1 + x2, data, w)
This does not work
fit = fitmodel(y ~ ., data, w)
Error in eval(expr, envir, enclos) : object 'w' not found
What's the correct way to apply weights inside a custom function? Thanks!
Answer 1:
Hopefully someone else gives a more complete answer. The reason rpart can't find w is that rpart searches the environment the formula was defined in for data, weights, etc. The formula is created in some environment, most likely the GlobalEnv, while w is created inside some other function. Changing the environment of the formula to the environment where w is created, using parent.frame, fixes that. rpart can still find the data, since the search path always continues to the GlobalEnv. I wasn't sure at first why sys.frame(sys.nframe()) works, since the environments aren't the same, but w is still reachable from there.
Edit: sys.frame(sys.nframe()) seems to be the same as setting the environment of the formula to the environment of the function rpart is called in (foo3 in this example). In that case, rpart looks for w, data, etc. in foo3 and then in the GlobalEnv. Because R scoping is lexical, the caller bar3 is not on that path; w is found because it is an argument of foo3.
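To make the lookup rule concrete, here is a small base R sketch showing that a formula remembers the environment it was created in, and that changing that environment changes which variables are reachable through it. The names make_formula, g, w_visible, and w_hidden are hypothetical and not part of the question; the sketch only illustrates the mechanism and does not call rpart.

```r
# A formula created inside a function keeps that function's environment.
make_formula <- function() {
  w <- c(1, 2, 3)  # local variable, not defined in the global environment
  y ~ x
}
g <- make_formula()

# 'w' is reachable through the formula's environment ...
w_visible <- exists("w", envir = environment(g), inherits = FALSE)
print(w_visible)

# ... and re-pointing the environment changes what is reachable.
environment(g) <- emptyenv()
w_hidden <- !exists("w", envir = environment(g), inherits = FALSE)
print(w_hidden)
```

This is the same operation the foo2 and foo3 examples perform with environment(formula) <- ..., just without a model fit in the way.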
library(rpart)
data(iris)
bar <- function(formula, data) {
w <- rpois(nrow(iris), 1)
print(environment())
foo(formula, data, w)
}
foo <- function(formula, data, w) {
print(environment(formula))
fit <- rpart(formula, data, weights = w)
return(fit)
}
bar(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x1045b1a78>
## <environment: R_GlobalEnv>
## Error in eval(expr, envir, enclos) (from #2) : object 'w' not found
bar2 <- function(formula, data) {
w <- rpois(nrow(iris), 1)
print(environment())
foo2(formula, data, w)
}
foo2 <- function(formula, data, w) {
print(environment(formula))
environment(formula) <- parent.frame()
print(environment(formula))
fit <- rpart(formula, data, weights = w)
return(fit)
}
bar2(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x100bf5910>
## <environment: R_GlobalEnv>
## <environment: 0x100bf5910>
bar3 <- function(formula, data) {
w <- rpois(nrow(iris), 1)
print(environment())
foo3(formula, data, w)
}
foo3 <- function(formula, data, w) {
print(environment(formula))
environment(formula) <- environment() ## seems to be the same as sys.frame(sys.nframe())
print(environment(formula))
print(environment())
fit <- rpart(formula, data, weights = w)
return(fit)
}
bar3(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x104e11bb8>
## <environment: R_GlobalEnv>
## <environment: 0x104b4ff78>
## <environment: 0x104b4ff78>
Answer 2:
According to the rpart documentation (March 12, 2017, page 23, section 6.1), "Weights are not yet supported, and will be ignored if present."
https://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf
Answer 3:
I've managed to solve this using the code below, but I'm sure there's a better way:
The weak learner
fitmodel = function(formula, data, w) {
# just paste the weights into the data frame
data$w = w
rpart(formula, data, weights = w, control = rpart.control(maxdepth = 1))
}
The algorithm
ada.boost = function(formula, data, wl.FUN = fitmodel, test.data = NULL, M = 100) {
# Just rewrites the formula and gets rid of any '.'
dep.var = all.vars(formula)[1]
vars = attr(terms(formula, data = data), "term.labels")
formula = as.formula(paste(dep.var, "~", paste(vars, collapse = "+")))
# ...more code
}
Now everything works!
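The formula-rewriting step can be checked in isolation. The sketch below uses a made-up data frame d (an assumption for illustration) and shows that terms() expands '.' against whatever columns are present at that moment, so the expansion has to happen before the w column is pasted in. It uses reformulate() from the stats package in place of paste() plus as.formula(); that substitution is mine, not what the answer uses.

```r
# Hypothetical toy data; column names mirror the question's y, x1, x2.
d <- data.frame(y = c(0, 1, 0, 1), x1 = 1:4, x2 = c(2, 1, 4, 3))

# Expand '.' while the data frame holds only the real predictors.
vars <- attr(terms(y ~ ., data = d), "term.labels")
explicit <- reformulate(vars, response = "y")
print(explicit)

# Once a weights column is added, '.' would pick it up as a predictor too.
d$w <- rep(1, nrow(d))
vars_with_w <- attr(terms(y ~ ., data = d), "term.labels")
print(vars_with_w)
```

With the explicit formula, the w column added inside fitmodel is used only as weights and never enters the model as a predictor.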
Source: https://stackoverflow.com/questions/22258739/apply-weights-in-rpart-model-gives-error