model-comparison | 易学教程

Best function to compare caret model objects

Question: I have a number of caret model objects using the same data and tuning parameters. As a sanity check, I want to see whether each method gives me the same model object. (This is all part of a broader plan to run parallel processing and ensure my models are the same.) For example, below I train two different models and want to compare them. When I compare the caret objects, it returns FALSE.

    > library(caret)
    >
    > set.seed(0)
    > myControl <- trainControl(method='cv', index=createFolds(iris$Species))
    >
    > set
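To illustrate why a strict comparison of two fitted objects can return FALSE even when the fits agree, here is a minimal base-R sketch. The lists `m1` and `m2` are hypothetical stand-ins, not caret objects, and the `times` field is an assumed example of a per-run element (such as elapsed time) that differs between otherwise identical runs.

```r
# Hypothetical stand-ins for two fitted model objects (not real caret output).
# Both hold the same coefficients but a different, run-specific timing value.
m1 <- list(coef = c(a = 1, b = 2), times = 0.31)
m2 <- list(coef = c(a = 1, b = 2), times = 0.47)

# identical() is strict: any differing element makes the result FALSE.
print(identical(m1, m2))

# all.equal() describes where the two objects differ instead of only
# returning a single logical value.
print(all.equal(m1, m2))

# Helper (hypothetical) that drops the run-specific field before comparing.
drop_timing <- function(m) m[setdiff(names(m), "times")]
print(identical(drop_timing(m1), drop_timing(m2)))
```

Whether real caret objects differ only in such run-specific fields would have to be checked on the actual objects, for example by inspecting what `all.equal()` reports for them.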

What is a threshold in a Precision-Recall curve?

Question: I am aware of the concept of precision as well as the concept of recall, but I am finding it very hard to understand the idea of a 'threshold', which is what makes any P-R curve possible. Imagine I have to build a model that predicts the recurrence (yes or no) of cancer in patients using some decent classification algorithm on relevant features. I split my data for training and testing. Let's say I trained the model using the training data and got my precision and recall metrics using the test data.
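A small sketch may help make the role of the threshold concrete. It assumes the classifier outputs a probability of recurrence for each test patient; the values in `prob` and `truth` are invented for illustration and `pr_at` is a hypothetical helper. Each threshold turns the probabilities into yes/no predictions, and each set of predictions gives one (precision, recall) pair, that is, one point on the P-R curve.

```r
# Hypothetical predicted probabilities of recurrence and true labels (1 = yes).
prob  <- c(0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10)
truth <- c(1,    1,    0,    1,    0,    1,    0,    0)

# Precision and recall when every probability >= threshold is called "yes".
pr_at <- function(threshold) {
  pred <- as.integer(prob >= threshold)
  tp <- sum(pred == 1 & truth == 1)  # true positives
  fp <- sum(pred == 1 & truth == 0)  # false positives
  fn <- sum(pred == 0 & truth == 1)  # false negatives
  c(threshold = threshold,
    precision = tp / (tp + fp),
    recall    = tp / (tp + fn))
}

# One row per threshold; plotting recall against precision gives the curve.
thresholds <- c(0.25, 0.50, 0.75)
pr_curve <- t(sapply(thresholds, pr_at))
print(pr_curve)
```

In this toy data, raising the threshold from 0.25 to 0.75 moves precision up (about 0.67, 0.75, 1) while recall falls (1, 0.75, 0.5), which is the trade-off the curve traces out.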

Subsetting in dredge (MuMIn) - must include interaction if main effects are present

Question: I'm doing some exploratory work where I use dredge{MuMIn}. In this procedure there are two variables that I want to allow together ONLY when the interaction between them is present, i.e. they cannot be present together only as main effects. Using sample data: I want to dredge the model fm1 (disregarding that it probably doesn't make sense). If the variables GNP and Population appear together, the model must also include the interaction between them.

    require(stats); require(graphics)
    #

Model comparison for breakpoint time series model in R strucchange

Question: I want to test whether a time series contains structural changes or not. This simulated example creates a series with two breaks, after 30 and 80 observations.

    set.seed(42)
    sim_data = data.frame(outcome = c(rnorm(30, 10, 1), rnorm(50, 20, 2), rnorm(20, 45, 1)))
    sim_ts = ts(data = sim_data, start = c(2010, 1), frequency = 12)
    plot(sim_ts)

I use the strucchange R package to determine the number (if any) of break points and model these:

    library("strucchange")
    break_points = breakpoints(sim

AIC different between biglm and lm

Question: I have been trying to use biglm to run linear regressions on a large dataset (approx. 60,000,000 lines). I want to use AIC for model selection. However, I discovered when playing with biglm on smaller datasets that the AIC values returned by biglm are different from those returned by lm. This even applies to the example in the biglm help.

    data(trees)
    ff<-log(Volume)~log(Girth)+log(Height)
    chunk1<-trees[1:10,]
    chunk2<-trees[11:20,]
    chunk3<-trees[21:31,]
    library(biglm)
    a <- biglm(ff,chunk1)
    a
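As a reference point for this kind of comparison, the sketch below reconstructs the AIC that `lm` reports from the Gaussian log-likelihood, using the same `trees` formula as the question. It covers only the `lm` side and uses base R; it makes no claim about how biglm computes its value. The names `fit`, `rss`, `k`, and `aic_manual` are introduced here for illustration.

```r
# Fit the same model as in the question, on the full data set, with lm().
data(trees)
fit <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees)

n   <- nrow(trees)
rss <- sum(residuals(fit)^2)       # residual sum of squares
k   <- length(coef(fit)) + 1       # regression coefficients + error variance

# Gaussian log-likelihood evaluated at the ML variance estimate rss / n.
loglik <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)

# AIC = -2 * log-likelihood + 2 * number of estimated parameters.
aic_manual <- -2 * loglik + 2 * k

# Compare the hand-computed value with the one reported by AIC().
print(c(manual = aic_manual, builtin = AIC(fit)))
print(all.equal(aic_manual, AIC(fit)))
```

If another fitting function leaves out the constant terms or counts parameters differently, its AIC will not match this value, although differences between models fitted by the same function on the same data may still be comparable.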

Model selection using glmulti

I am attempting to run glmulti to test all possible subsets for model selection. The following is the code that I am trying to use.

    lmer.glmulti<-function(formula, data, random="", ...){
      lmer(paste(deparse(formula),random),data=data, REML=FALSE,...)
    }
    glmulti <- glmulti(formula(lmer(transLOT~DielEnd+TidalHeight+Pier+PercentIllumination+WT+BP+Anglers+(1|Transmitter), data=RESIDENCY_FOR_R), fixed.only=TRUE), data=RESIDENCY_FOR_R, level = 1, method = "h", crit = "bic", confsetsize = 5, plotty = F, report = F, fitfunc = lmer.glmulti, random="+(1|Transmitter)", intercept=TRUE)

A problem arises with

AIC with weighted nonlinear regression (nls)

I encounter some discrepancies when comparing the deviance of a weighted and an unweighted model with the AIC values. A general example (from ‘nls’):

    DNase1 <- subset(DNase, Run == 1)
    fm1DNase1 <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1)

This is the unweighted fit; in the code of ‘nls’ one can see that ‘nls’ generates a vector wts <- rep(1, n). Now for a weighted fit:

    fm2DNase1 <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1, weights = rep(1:8, each = 2))

in which I assign increasing weights for each of the 8 concentrations with 2 replicates. Now with deviance I