glm

Why is caret train taking up so much memory?

喜欢而已 submitted on 2019-12-29 14:19:04
Question: When I train just using glm, everything works, and I don't even come close to exhausting memory. But when I run train(..., method='glm'), I run out of memory. Is this because train is storing a lot of data for each iteration of the cross-validation (or whatever the trControl procedure is)? I'm looking at trainControl and I can't find how to prevent this... any hints? I only care about the performance summary and maybe the predicted responses. (I know it's not related to storing data from
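One source of extra memory that can be checked without caret is what each fitted glm object keeps by default. The following is a minimal base R sketch, using the built-in mtcars data as a stand-in (an assumption; the asker's data is not shown). It compares a default fit with one that drops the stored model frame and response. In caret, trainControl(returnData = FALSE) plays a comparable role for the training data, though the excerpt does not establish whether that resolves this particular case.

```r
# Default fit: keeps a copy of the model frame ($model) and the response ($y).
full <- glm(am ~ mpg, data = mtcars, family = binomial)

# Lean fit: model = FALSE and y = FALSE ask glm not to store those copies.
lean <- glm(am ~ mpg, data = mtcars, family = binomial,
            model = FALSE, y = FALSE)

size_full <- as.numeric(object.size(full))
size_lean <- as.numeric(object.size(lean))
cat("default fit:", size_full, "bytes\n")
cat("lean fit:   ", size_lean, "bytes\n")
```

The coefficients are unaffected; only the bookkeeping stored alongside them changes, which matters when many such objects are held at once during resampling.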

modify glm function to adopt user-specified link function in R

一笑奈何 submitted on 2019-12-28 03:39:06
Question: In glm in R, the link functions available for the Gamma family are inverse (the default), identity and log. For my particular question, I need to use gamma regression with response Y and a modified link function of the form log(E(Y)-1). Thus, I am considering modifying some glm-related functions in R. There are several functions that may be relevant, and I am seeking help from anyone who has previous experience in doing this. For example, the function Gamma is defined as function (link = "inverse") {
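One possible route that avoids editing glm internals is to pass a hand-built link object of class "link-glm" to Gamma(). The sketch below is an illustration under assumptions: the object name log_minus_one, the simulated data and the starting values are all hypothetical, not taken from the question.

```r
# Hypothetical link object for g(mu) = log(mu - 1); the name is illustrative.
log_minus_one <- structure(
  list(
    linkfun  = function(mu) log(mu - 1),     # eta = g(mu)
    linkinv  = function(eta) exp(eta) + 1,   # mu = g^{-1}(eta), always > 1
    mu.eta   = function(eta) exp(eta),       # d mu / d eta
    valideta = function(eta) TRUE,
    name     = "log(mu - 1)"
  ),
  class = "link-glm"
)

# Simulated data (an assumption for illustration): E(Y) = 1 + exp(0.5 + x).
set.seed(1)
n  <- 500
x  <- runif(n)
mu <- 1 + exp(0.5 + x)
y  <- rgamma(n, shape = 50, rate = 50 / mu)

# start is supplied so that glm does not initialise from log(y - 1),
# which is undefined whenever y <= 1.
fit <- glm(y ~ x, family = Gamma(link = log_minus_one), start = c(0, 0))
print(coef(fit))
```

Because the inverse link maps every linear predictor to a mean above 1, the fitted means stay in the range the link requires; only the initial values need care.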

How to remove correlated variables from GLM in R

五迷三道 submitted on 2019-12-25 09:49:13
Question: I am trying to exclude correlated variables from a GLM. First, I calculate the correlation matrix. Afterwards, I would like to feed it into the combn function in some way, so that the variables (column headers) that are correlated are excluded. This is where I fail: I cannot incorporate it into combn so that it works and the correlated variables are excluded. Here is the link to the data I use: https://drive.google.com/open?id=0B5IgiR_svnKcZkxHeTJXTm9jUjQ Here is the code I am trying to make
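As a rough illustration of the filtering step, here is a sketch that swaps the combn approach for a simple greedy filter on the correlation matrix. The helper name drop_correlated, the 0.8 cutoff and the use of mtcars are assumptions made for the example; the linked data is not used.

```r
# Greedy filter: walk through the columns in order and keep a variable only
# if its absolute correlation with every variable kept so far is below cutoff.
drop_correlated <- function(df, cutoff = 0.8) {
  cm <- abs(cor(df))
  keep <- character(0)
  for (v in colnames(cm)) {
    if (all(cm[v, keep] < cutoff)) keep <- c(keep, v)
  }
  keep
}

predictors <- mtcars[, c("cyl", "disp", "hp", "wt", "qsec", "drat")]
kept <- drop_correlated(predictors, cutoff = 0.8)
print(kept)

# Build the model formula from the surviving column names.
fml <- reformulate(kept, response = "mpg")
fit <- glm(fml, data = mtcars)
print(formula(fit))
```

The result depends on column order, since earlier columns win ties; a different ordering or cutoff can keep a different set.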

How to use constructed formula with glm.mids

故事扮演 submitted on 2019-12-24 17:15:49
Question: I am working with a large number of variables and addressing them with constructed formulas (via paste0()) built from variables passed to functions. I have stumbled across a problem/bug I cannot figure out. It is easiest explained with a toy example:

    library(mice)
    imp2 = mice(nhanes)
    # So both these models run fine:
    mod1 <- glm(bmi ~ hyp + age, data=nhanes)
    mod1.im <- with(imp2, glm(bmi ~ hyp + age))
    # However if I try to pass a formula to glm() in the with() I get an error
    formula = bmi ~ hyp + age
    mod2 <-

Elegantly convert rate summary rows into long binary-response rows?

孤人 submitted on 2019-12-24 06:46:08
Question: Background: I am running a little A/B test with 2x2 factors (foreground's black and background's white, off-color vs normal color), and Analytics reports the number of hits for each of the 4 conditions and the rate at which they 'converted' (a binary variable, which I define as spending at least 40 seconds on the page). It's easy enough to do a little editing and get it into a nice R dataframe:

    rates <- read.csv(stdin(),header=TRUE)
    Black,White,N,Rate
    TRUE,FALSE,512,0.2344
    FALSE,TRUE,529,0.2098
    TRUE,TRUE
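A minimal base R sketch of the expansion, using only the two complete rows visible in the excerpt (the remaining rows are truncated and are not guessed here). The column names Hits and Converted are hypothetical additions, and conversions are recovered by rounding N * Rate, which is an assumption about how the rates were reported.

```r
rates <- data.frame(
  Black = c(TRUE, FALSE),
  White = c(FALSE, TRUE),
  N     = c(512, 529),
  Rate  = c(0.2344, 0.2098)
)

# Recover the count of conversions per condition.
rates$Hits <- round(rates$N * rates$Rate)

# Repeat each condition row N times, then attach Hits ones and N - Hits zeros.
long <- rates[rep(seq_len(nrow(rates)), times = rates$N), c("Black", "White")]
long$Converted <- unlist(Map(
  function(n, k) rep(c(1, 0), times = c(k, n - k)),
  rates$N, rates$Hits
))
rownames(long) <- NULL

# The long form and the grouped form give the same logistic fit.
fit_long    <- glm(Converted ~ Black, family = binomial, data = long)
fit_grouped <- glm(cbind(Hits, N - Hits) ~ Black, family = binomial, data = rates)
print(coef(fit_long))
print(coef(fit_grouped))
```

If the only goal is a logistic regression, the grouped cbind(successes, failures) form makes the expansion unnecessary.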

Odds ratio for ordinal variables from PROC GENMOD

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-24 06:03:35
Question: I have a set of data for which I am creating a logistic regression model, looking at the odds of a binary outcome variable (Therapy), with Stage as an ordinal explanatory variable (0,1,2,3,4). A1c is a continuous variable. Because each patient has two eyes, I must use the repeated subject = patientID(EyeID) statement. The following is my code:

    PROC GENMOD data=new descend;
    class patientID EyeID Stage (param = ordinal) Therapy (ref ="0") Gender(ref="M") Ethnic agegroup/ PARAM=ref;
    model Therapy =

How do you get R's null and residual deviance equivalents in Matlab fitglm?

久未见 submitted on 2019-12-23 17:32:15
Question: In R, after fitting a glm you can get summary info containing the residual deviance and null deviance, which tell you how good your model is compared to the model with just the intercept term. For the example model:

    model <- glm(formula = am ~ mpg + qsec, data=mtcars, family=binomial)

we have:

    > summary(model)
    ...
    Null deviance: 43.2297 on 31 degrees of freedom
    Residual deviance: 7.5043 on 29 degrees of freedom
    AIC: 13.504
    ...

In Matlab, when you use fitglm you return an object of
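To make the two quantities concrete, here is a short R sketch showing that the null deviance is simply the deviance of an intercept-only fit, so it can be reproduced in any package that can fit that second model. The object name null_model is introduced for illustration; nothing here relies on a particular Matlab function.

```r
model <- glm(am ~ mpg + qsec, data = mtcars, family = binomial)

# Intercept-only fit: its residual deviance is the null deviance of `model`.
null_model <- glm(am ~ 1, data = mtcars, family = binomial)

cat("null deviance:    ", model$null.deviance, "on", model$df.null, "df\n")
cat("intercept-only:   ", deviance(null_model), "\n")
cat("residual deviance:", deviance(model), "on", model$df.residual, "df\n")

# For a 0/1 response the saturated log-likelihood is zero, so the residual
# deviance equals -2 times the model log-likelihood.
cat("-2 * logLik:      ", -2 * as.numeric(logLik(model)), "\n")

# Likelihood-ratio statistic against the intercept-only model.
lr_stat <- model$null.deviance - deviance(model)
lr_df   <- model$df.null - model$df.residual
cat("LR statistic:", lr_stat, "on", lr_df, "df\n")
```

The same recipe (fit the full model, fit the constant-only model, compare deviances or log-likelihoods) carries over to other environments.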

logit binomial regression with clustered standard errors

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-23 17:26:59
Question: I'm trying to replicate a glm estimation from Stata:

    sysuse auto
    logit foreign weight mpg, cluster(rep78)

    Logistic regression                 Number of obs = 69
                                        Wald chi2(2) = 31.57
                                        Prob > chi2 = 0.0000
    Log pseudolikelihood = -22.677963   Pseudo R2 = 0.4652
    (Std. Err. adjusted for 5 clusters in rep78)
    ------------------------------------------------------------------------------
                 | Robust
         foreign | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -------------+-------------------------------------------------------
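For orientation, a cluster-robust (sandwich) covariance for a logit fit can be assembled by hand in base R. The sketch below is not a replication of the Stata output: it uses mtcars with cyl as the cluster variable as stand-ins (assumptions, since the Stata auto data is not bundled with R), and the G / (G - 1) small-sample factor is assumed to mirror Stata's choice for this estimator.

```r
fit <- glm(am ~ wt, data = mtcars, family = binomial)
cluster <- mtcars$cyl  # stand-in cluster variable (assumption)

X <- model.matrix(fit)

# Per-observation score contributions; for the canonical logit link these
# are x_i * (y_i - mu_i).
scores <- X * (mtcars$am - fitted(fit))

# Sum the scores within each cluster, then form the "meat" of the sandwich.
cluster_scores <- rowsum(scores, group = cluster)
meat <- crossprod(cluster_scores)

# "Bread": the usual model-based covariance (dispersion is 1 for binomial).
bread <- vcov(fit)

# Small-sample factor G / (G - 1), where G is the number of clusters.
G <- length(unique(cluster))
vc_cluster <- (G / (G - 1)) * bread %*% meat %*% bread

se <- sqrt(diag(vc_cluster))
print(cbind(estimate = coef(fit), cluster_se = se))
```

The point estimates are unchanged by clustering; only the standard errors differ from those reported by summary(fit).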