lm

How can I specify a relationship between parameter estimates in lm?

喜你入骨 submitted on 2019-12-05 02:04:07
Question: Using lm, I would like to fit the model: y = b0 + b1*x1 + b2*x2 + b1*b2*x1*x2. My question is: how can I specify that the coefficient of the interaction should equal the product of the coefficients of the main effects? I've seen that you can set a coefficient to a specific value using offset() and I(), but I don't know how to specify a relationship between coefficients. Here is a simple simulated dataset: n <- 50 # Sample size x1 <- rnorm(n, 1:n, 0.5) # Independent variable 1 x2 <- rnorm
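A constraint of this kind makes the model nonlinear in its parameters, so lm() cannot express it directly. One possible route, sketched here with nls(), fits the constrained form by nonlinear least squares. The simulated data, the true values (1, 0.5, 0.3), and the names `dat`, `start_fit`, and `fit_nls` are assumptions made for illustration; they are not taken from the original question.

```r
# Hypothetical data: the true coefficients are chosen for this sketch only.
set.seed(1)
n  <- 50
x1 <- runif(n, 0, 5)
x2 <- runif(n, 0, 5)
y  <- 1 + 0.5 * x1 + 0.3 * x2 + (0.5 * 0.3) * x1 * x2 + rnorm(n, sd = 0.1)
dat <- data.frame(y, x1, x2)

# Use an unconstrained interaction fit only to obtain starting values.
start_fit <- lm(y ~ x1 * x2, data = dat)
cf <- unname(coef(start_fit))

# Constrained model: the interaction coefficient is forced to equal b1 * b2.
fit_nls <- nls(
  y ~ b0 + b1 * x1 + b2 * x2 + b1 * b2 * x1 * x2,
  data  = dat,
  start = list(b0 = cf[1], b1 = cf[2], b2 = cf[3])
)
coef(fit_nls)
```

Only three parameters are estimated, so the interaction term carries no free coefficient of its own.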

plot.lm(): extracting numbers labelled in the diagnostic Q-Q plot

这一生的挚爱 submitted on 2019-12-04 23:32:38
For the simple example below, you can see that there are certain points that are identified in the ensuing plots. How can I extract the row numbers identified in these plots, especially the Normal Q-Q plot? set.seed(2016) maya <- data.frame(rnorm(100)) names(maya)[1] <- "a" maya$b <- rnorm(100) mara <- lm(b~a, data=maya) plot(mara) I tried using str(mara) to see if I could find a list there, but I can't see any of the numbers from the Normal Q-Q plot there. Thoughts? I have edited your question using set.seed(2016) for reproducibility. To answer your question, I need to explain how to produce
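As a rough sketch of how the labelled rows might be recovered without reading them off the plot: the documentation for plot.lm() says that `id.n` points (3 by default) are labelled, starting with the most extreme. The code below assumes that, for the Normal Q-Q plot, "most extreme" means the largest absolute standardized residuals. The names `rs`, `id_n`, and `qq_labelled` are introduced here for illustration.

```r
set.seed(2016)
maya <- data.frame(a = rnorm(100))
maya$b <- rnorm(100)
mara <- lm(b ~ a, data = maya)

# Standardized residuals, which the Q-Q plot displays on its y-axis.
rs <- rstandard(mara)

# Assumption: the labelled points are the id_n largest values of |rs|.
id_n <- 3
qq_labelled <- order(abs(rs), decreasing = TRUE)[seq_len(id_n)]

# Row labels of the candidate points and their standardized residuals.
rs[qq_labelled]
```

Comparing these indices against the labels drawn by plot(mara, which = 2) is a quick way to check the assumption.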

How does plot.lm() determine outliers for residual vs fitted plot?

限于喜欢 submitted on 2019-12-04 19:50:57
Question: How does plot.lm() determine which points are outliers (that is, which points to label) in the residuals vs fitted plot? The only thing I found in the documentation is this: Details sub.caption, by default the function call, is shown as a subtitle (under the x-axis title) on each plot when plots are on separate pages, or as a subtitle in the outer margin (if any) when there are multiple plots per page. The ‘Scale-Location’ plot, also called ‘Spread-Location’ or ‘S-L’ plot, takes the square root of
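The labelling does not appear to involve a formal outlier test. The `id.n` argument of plot.lm() is documented as the number of points to label in each plot, starting with the most extreme, with a default of 3. The sketch below assumes that, for the residuals vs fitted plot, "extreme" is measured by the absolute raw residual; the built-in `cars` data and the names `fit` and `labelled` are chosen here for illustration.

```r
# Built-in dataset used purely as an example.
fit <- lm(dist ~ speed, data = cars)

# Assumption: plot(fit, which = 1) labels the id.n largest |residuals|.
id_n <- 3
res <- residuals(fit)
labelled <- order(abs(res), decreasing = TRUE)[seq_len(id_n)]

# Row indices that would be labelled under this assumption.
labelled
res[labelled]
```

Passing a different `id.n` to plot() changes how many points receive a label, which is consistent with a ranking rule rather than a threshold.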

Rolling regression return multiple objects

允我心安 submitted on 2019-12-04 19:17:45
I am trying to build a rolling regression function based on the example here, but in addition to returning the predicted values, I would like to return some rolling model diagnostics (i.e. coefficients, t-values, and maybe R^2). I would like the results to be returned in discrete objects based on the type of results. The example provided in the link above successfully creates the rolling predictions, but I need some assistance packaging and writing out the rolling model diagnostics: In the end, I would like the function to return three (3) objects: Predictions Coefficients T values R^2
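One way to package the results, sketched in base R under several assumptions: the linked example is not available here, so the function `roll_lm`, its arguments, and the choice of "prediction" (the fitted value for the last row of each window) are all hypothetical.

```r
# Hypothetical helper: fits lm() on each window of `width` consecutive rows
# and returns one object per type of result.
roll_lm <- function(data, formula, width) {
  starts <- seq_len(nrow(data) - width + 1)
  fits <- lapply(starts, function(i) {
    lm(formula, data = data[i:(i + width - 1), , drop = FALSE])
  })
  list(
    # Assumption: the "prediction" is the fitted value of each window's last row.
    predictions  = vapply(fits, function(f) unname(tail(fitted(f), 1)), numeric(1)),
    coefficients = t(sapply(fits, coef)),
    t_values     = t(sapply(fits, function(f) coef(summary(f))[, "t value"])),
    r_squared    = vapply(fits, function(f) summary(f)$r.squared, numeric(1))
  )
}

# Usage on a built-in dataset: 50 rows and width 10 give 41 windows.
out <- roll_lm(cars, dist ~ speed, width = 10)
str(out, max.level = 1)
```

Returning a named list keeps each type of result in its own object while still allowing a single return value.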

Applying lm() and predict() to multiple columns in a data frame

淺唱寂寞╮ submitted on 2019-12-04 17:26:19
I have an example dataset below. train<-data.frame(x1 = c(4,5,6,4,3,5), x2 = c(4,2,4,0,5,4), x3 = c(1,1,1,0,0,1), x4 = c(1,0,1,1,0,0), x5 = c(0,0,0,1,1,1)) Suppose I want to create separate models for column x3 , x4 , x5 based on column x1 and x2 . For example lm1 <- lm(x3 ~ x1 + x2) lm2 <- lm(x4 ~ x1 + x2) lm3 <- lm(x5 ~ x1 + x2) I want to then take these models and apply them to a testing set using predict, and then create a matrix that has each model outcome as a column. test <- data.frame(x1 = c(4,3,2,1,5,6), x2 = c(4,2,1,6,8,5)) p1 <- predict(lm1, newdata = test) p2 <- predict(lm2,
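Because all three models share the same predictors, one option is a single lm() call with a matrix response built by cbind(); predict() then returns a matrix with one column per response. This is a sketch of that idea using the data from the question; the names `fit_all` and `pred` are introduced here.

```r
train <- data.frame(x1 = c(4, 5, 6, 4, 3, 5), x2 = c(4, 2, 4, 0, 5, 4),
                    x3 = c(1, 1, 1, 0, 0, 1), x4 = c(1, 0, 1, 1, 0, 0),
                    x5 = c(0, 0, 0, 1, 1, 1))
test <- data.frame(x1 = c(4, 3, 2, 1, 5, 6), x2 = c(4, 2, 1, 6, 8, 5))

# One call fits x3, x4 and x5 on x1 + x2; each response gets its own coefficients.
fit_all <- lm(cbind(x3, x4, x5) ~ x1 + x2, data = train)

# Matrix of predictions: one row per test row, one column per response.
pred <- predict(fit_all, newdata = test)
pred
```

Each column should match the predictions from the corresponding separate model, since the least-squares fits are computed column by column.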

Plotting one predictor of a model that has several predictors with ggplot

时间秒杀一切 submitted on 2019-12-04 17:04:05
Here is a typical example of a linear model and a ggplot: require(ggplot2) utils::data(anorexia, package = "MASS") anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian, data = anorexia) coef(anorex.1) (Intercept) Prewt TreatCont TreatFT 49.7711090 -0.5655388 -4.0970655 4.5630627 ggplot(anorexia, aes(y=Postwt, x=Prewt)) + geom_point() + geom_smooth(method='lm', se=F) My problem is that the regression fitted by geom_smooth(...) is not the same model as anorex.1 but is: coef(lm(Postwt ~ Prewt, data=anorexia)) (Intercept) Prewt 42.7005802 0.5153804 How can I plot the
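One possible approach is to compute the lines from the fitted model itself rather than letting geom_smooth() refit a simpler one. The sketch below builds a prediction grid with predict(); the names `grid_df` and `n_points` are assumptions, and the plotting step is left out so the example depends only on MASS.

```r
utils::data(anorexia, package = "MASS")
anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
                family = gaussian, data = anorexia)

# Grid over the observed Prewt range, repeated for every treatment level.
n_points <- 50
grid_df <- expand.grid(
  Prewt = seq(min(anorexia$Prewt), max(anorexia$Prewt), length.out = n_points),
  Treat = levels(anorexia$Treat)
)

# Predictions from the full model; the offset is taken from the formula.
grid_df$Postwt <- predict(anorex.1, newdata = grid_df)
head(grid_df)
```

A data frame like `grid_df` could then be passed to a line layer, for example geom_line(data = grid_df, aes(colour = Treat)), so that the drawn lines come from anorex.1.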

Get all models from leaps regsubsets

☆樱花仙子☆ submitted on 2019-12-04 16:54:04
I used regsubsets to search for models. Is it possible to automatically create all lm from the list of parameter selections? library(leaps) leaps<-regsubsets(y ~ x1 + x2 + x3, data, nbest=1, method="exhaustive") summary(leaps)$which (Intercept) x1 x2 x3 1 TRUE FALSE FALSE TRUE 2 TRUE FALSE TRUE TRUE 3 TRUE TRUE TRUE TRUE Now I would manually do model_1 <- lm(y ~ x3) and so on. How can this be automated to have them in a list? I don't know why you want a list of all models. summary and coef methods should serve you well. But I will first answer your question from a pure programming aspect, then
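From a purely programmatic angle, each row of the logical matrix can be turned into a formula with reformulate() and passed to lm(). The sketch below hard-codes the matrix shown in the question instead of calling regsubsets(), and the data frame `dat` is simulated; both are assumptions made to keep the example self-contained. It also assumes the column names are plain variable names, which may not hold when factors are involved.

```r
# The selection matrix from the question, typed in by hand.
which_mat <- rbind(c(TRUE, FALSE, FALSE, TRUE),
                   c(TRUE, FALSE, TRUE,  TRUE),
                   c(TRUE, TRUE,  TRUE,  TRUE))
dimnames(which_mat) <- list(1:3, c("(Intercept)", "x1", "x2", "x3"))

# Hypothetical data with matching column names.
set.seed(42)
dat <- data.frame(x1 = rnorm(30), x2 = rnorm(30), x3 = rnorm(30))
dat$y <- 1 + 2 * dat$x3 + rnorm(30)

# One lm per row: keep the selected columns, dropping the intercept column.
models <- lapply(seq_len(nrow(which_mat)), function(i) {
  vars <- setdiff(colnames(which_mat)[which_mat[i, ]], "(Intercept)")
  lm(reformulate(vars, response = "y"), data = dat)
})
lapply(models, formula)
```

With a real regsubsets fit, `which_mat` would be replaced by summary(leaps)$which.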

fixed effects in R: plm vs lm + factor()

浪尽此生 submitted on 2019-12-04 16:04:00
I'm trying to run a fixed effects regression model in R. I want to control for heterogeneity in variables C and D (neither is a time variable). I tried the following two approaches: 1) Use the plm package: Gives me the following error message formula = Y ~ A + B + C + D reg = plm(formula, data= data, index=c('C','D'), method = 'within') duplicate couples (time-id) Error in pdim.default(index[[1]], index[[2]]) : I also tried creating first a panel using data_p = pdata.frame(data,index=c('C','D')) But I have repeated observations in both columns. 2) Use factor() and lm: works well formula = Y ~
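To illustrate the second approach, here is a minimal sketch of a dummy-variable fit with lm() and factor(). The data are simulated and the effect sizes (2 for A, -1 for B) are assumptions chosen for this example; only the variable names follow the question.

```r
# Hypothetical data: C and D are grouping variables with repeated observations.
set.seed(123)
n <- 200
panel <- data.frame(
  A = rnorm(n),
  B = rnorm(n),
  C = sample(1:5, n, replace = TRUE),
  D = sample(1:4, n, replace = TRUE)
)
c_eff <- c(-2, -1, 0, 1, 2)   # group-level shifts for C (illustrative)
d_eff <- c(0.5, -0.5, 1, -1)  # group-level shifts for D (illustrative)
panel$Y <- 2 * panel$A - 1 * panel$B + c_eff[panel$C] + d_eff[panel$D] +
  rnorm(n, sd = 0.5)

# Each level of C and D gets its own intercept shift via dummy variables.
fit_fe <- lm(Y ~ A + B + factor(C) + factor(D), data = panel)
coef(fit_fe)[c("A", "B")]
```

Because C and D enter as factors, repeated combinations of the two are not a problem for lm(), unlike for an index pair that must identify each row uniquely.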

Updating a linear regression model with update and purrr

牧云@^-^@ submitted on 2019-12-04 13:46:51
I want to update an lm model using the update function inside a map call, but this throws the following error: mtcars %>% group_by(cyl) %>% nest() %>% mutate(lm1 = map(data, ~lm(mpg ~ wt, data = .x)), lm2 = map(lm1, ~update(object = .x, formula = .~ . + hp))) Error in mutate_impl(.data, dots) : Evaluation error: cannot coerce class "lm" to a data.frame. Can anyone help me with this problem? I am confused about this error, because e.g. this works totally fine: mtcars %>% group_by(cyl) %>% nest() %>% mutate(lm1 = map(data, ~lm(mpg ~ wt, data = .x)), lm2 = map_dbl(lm1, ~coefficients(.x)[1]))
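A plausible reading of the error is that update() re-evaluates the stored call, including `data = .x`, in the calling environment, where `.x` now refers to the lm object rather than the data. The base R sketch below sidesteps this by passing the data explicitly when updating; the names `dats`, `fits1`, and `fits2` are introduced here, and the same idea presumably carries over to a two-argument map over the models and their data.

```r
# Split the data by cylinder count and fit the first model per group.
dats  <- split(mtcars, mtcars$cyl)
fits1 <- lapply(dats, function(d) lm(mpg ~ wt, data = d))

# Update each model and supply its own data, so the re-evaluated call
# does not depend on a variable from the earlier fitting step.
fits2 <- Map(function(fit, d) update(fit, . ~ . + hp, data = d), fits1, dats)

lapply(fits2, coef)
```

Supplying `data` at update time makes each refit self-contained, at the cost of carrying the data alongside the models.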

Predict.glm not predicting missing values in response

核能气质少年 submitted on 2019-12-04 11:23:03
Question: For some reason, when I specify glms (and lm's too, it turns out), R is not predicting missing values of the data. Here is an example: y = round(runif(50)) y = c(y,rep(NA,50)) x = rnorm(100) m = glm(y~x, family=binomial(link="logit")) p = predict(m,na.action=na.pass) length(p) y = round(runif(50)) y = c(y,rep(NA,50)) x = rnorm(100) m = lm(y~x) p = predict(m) length(p) The length of p should be 100, but it is 50. The weird thing is that I have other predicts in the same script that do predict
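Without `newdata`, predict() returns fitted values for the rows that were actually used in the fit, and rows with a missing response are dropped under the default na.action. A sketch of one way to obtain predictions for every row is to pass the predictors explicitly through `newdata`; the names `p_fit` and `p_all` are introduced here for illustration.

```r
set.seed(1)
y <- c(round(runif(50)), rep(NA, 50))
x <- rnorm(100)
m <- glm(y ~ x, family = binomial(link = "logit"))

# Fitted values only: the 50 rows with missing y were dropped when fitting.
p_fit <- predict(m)
length(p_fit)  # 50

# Supplying the predictors as newdata yields one prediction per row.
p_all <- predict(m, newdata = data.frame(x = x), type = "response")
length(p_all)  # 100
```

The same pattern applies to lm(), since only the predictor values are needed to compute a prediction.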