lm

What is the difference between lm(offense$R ~ offense$OBP) and lm(R ~ OBP)?

送分小仙女 submitted on 2019-11-29 14:12:00
I am trying to use R to create a linear model and use it to predict some values. The subject matter is baseball stats. If I do this:
    obp <- lm(offense$R ~ offense$OBP)
    predict(obp, newdata=data.frame(OBP=0.5), interval="predict")
I get the warning:
    Warning message: 'newdata' had 1 row but variables found have 20 rows.
However, if I do this:
    attach(offense)
    obp <- lm(R ~ OBP)
    predict(obp, newdata=data.frame(OBP=0.5), interval="predict")
it works as expected and I get one result. What is the difference between the two? If I just print OBP and offense$OBP, they look the same. In the first case,
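
A minimal sketch of the likely difference, using made-up data (the `offense` values below are hypothetical, not the asker's). When the formula is written as `offense$R ~ offense$OBP`, the predictor is literally named `offense$OBP`, so `predict()` cannot find it in `newdata` and falls back to the original 20 rows. Passing `data = offense` keeps the predictor named `OBP` and makes `attach()` unnecessary.

```r
# Hypothetical stand-in for the asker's data: 20 teams with OBP and runs scored.
set.seed(1)
offense <- data.frame(OBP = runif(20, 0.28, 0.38))
offense$R <- 2500 * offense$OBP - 150 + rnorm(20, sd = 20)

# Formula built from offense$...: the predictor is named "offense$OBP".
bad <- lm(offense$R ~ offense$OBP)
pred_bad <- suppressWarnings(
  predict(bad, newdata = data.frame(OBP = 0.5), interval = "predict")
)

# Formula with bare names plus data =: the predictor is named "OBP".
good <- lm(R ~ OBP, data = offense)
pred_good <- predict(good, newdata = data.frame(OBP = 0.5), interval = "predict")

nrow(pred_bad)   # one row per original observation
nrow(pred_good)  # one row for the single new value
```

The `suppressWarnings()` call is only there to keep the sketch quiet; without it the first `predict()` emits the warning quoted in the question.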

Error in dataframe *tmp* replacement has x data has y

怎甘沉沦 submitted on 2019-11-29 13:46:18
I'm a beginner in R. Here is some very simple code in which I'm trying to save the residual term:
    # Create variables for child's EA:
    dat$cldeacdi <- rowMeans(dat[,c('cdcresp', 'cdcinv')],na.rm=T)
    dat$cldeacu <- rowMeans(dat[,c('cucresp', 'cucinv')],na.rm=T)
    # Create a residual score for child EA:
    dat$cldearesid <- resid(lm(cldeacu ~ cldeacdi, data = dat))
I'm getting the following message:
    Error in `$<-.data.frame`(`*tmp*`, cldearesid, value = c(-0.18608488908881, : replacement has 366 rows, data has 367
I searched for this error but couldn't find anything that could resolve it. Additionally, I've
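
One plausible cause is that `lm()` silently drops a row containing a missing value, so `resid()` returns one value fewer than the data frame has rows. A minimal sketch, assuming that is the situation (the data below are hypothetical): `na.action = na.exclude` makes `resid()` pad the dropped rows with `NA`.

```r
# Hypothetical data: one row has a missing predictor, so lm() drops it.
dat <- data.frame(
  cldeacdi = c(1.2, 2.4, NA, 3.1, 4.8, 5.0),
  cldeacu  = c(1.0, 2.9, 3.3, 3.0, 5.1, 4.7)
)

# na.exclude makes resid() pad dropped rows with NA, so lengths match nrow(dat).
fit <- lm(cldeacu ~ cldeacdi, data = dat, na.action = na.exclude)
dat$cldearesid <- resid(fit)

dat
```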

Error in calling `lm` in a `lapply` with `weights` argument

喜欢而已 submitted on 2019-11-29 13:38:34
I've encountered some weird behavior when calling lm within lapply using the weights argument. My code consists of a list of formulas, and I fit a linear model to each by calling lm in lapply. So far it was working:
    dd <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100), x4 = rnorm(100), wg = runif(100,1,100))
    ls.form <- list( formula(y~x1+x2), formula(y~x3+x4), formula(y~x1|x2|x3), formula(y~x1+x2+x3+x4) )
    res.no.wg <- lapply(ls.form, lm, data = dd)
However, when I add the weights argument, I get a weird error:
    res.with.wg <- lapply(ls.form, lm, data = dd, weights = dd[,"wg"
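
A commonly used workaround, sketched under the assumption that the problem comes from how `lm()` looks up `weights` (in `data` and the formula's environment rather than in the frame `lapply` calls it from): wrap the call in an anonymous function and refer to the weights column by its bare name. The sketch uses only the `+` formulas from the question.

```r
set.seed(42)
dd <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100),
                 x3 = rnorm(100), x4 = rnorm(100), wg = runif(100, 1, 100))

ls.form <- list(y ~ x1 + x2, y ~ x3 + x4, y ~ x1 + x2 + x3 + x4)

# Call lm() explicitly inside an anonymous function; `wg` is looked up in `dd`.
res.with.wg <- lapply(ls.form, function(f) lm(f, data = dd, weights = wg))

# Number of estimated coefficients per model.
sapply(res.with.wg, function(m) length(coef(m)))
```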

Looping over combinations of regression model terms

烈酒焚心 submitted on 2019-11-29 12:49:38
I'm running a regression of the form
    reg=lm(y ~ x1+x2+x3+z1,data=mydata)
In place of the last term, z1, I want to loop through a set of different variables, z1 through z10, running a regression for each with it as the last term. E.g. in the second run I want to use
    reg=lm(y ~ x1+x2+x3+z2,data=mydata)
and in the third run:
    reg=lm(y ~ x1+x2+x3+z3,data=mydata)
How can I automate this by looping through the list of z-variables?
With this dummy data:
    dat1 <- data.frame(y = rpois(100,5), x1 = runif(100), x2 = runif(100), x3 = runif(100), z1 = runif(100), z2 = runif(100) )
You could get your list of two lm
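
One way to do this, as a sketch rather than the answer the excerpt goes on to give: build each formula with `reformulate()` and fit the models with `lapply`. The names `z_vars` and `fits` are introduced here for illustration.

```r
set.seed(1)
dat1 <- data.frame(y = rpois(100, 5), x1 = runif(100), x2 = runif(100),
                   x3 = runif(100), z1 = runif(100), z2 = runif(100))

z_vars <- c("z1", "z2")

# Build y ~ x1 + x2 + x3 + z for each z and fit one model per formula.
fits <- lapply(z_vars, function(z) {
  lm(reformulate(c("x1", "x2", "x3", z), response = "y"), data = dat1)
})
names(fits) <- z_vars

lapply(fits, coef)
```

With ten variables, `z_vars <- paste0("z", 1:10)` would generate the names, provided those columns exist in the data.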

model.matrix(): why do I lose control of contrast in this case

谁说胖子不能爱 submitted on 2019-11-29 12:46:31
Suppose we have a toy data frame:
    x <- data.frame(x1 = gl(3, 2, labels = letters[1:3]), x2 = gl(3, 2, labels = LETTERS[1:3]))
I would like to construct a model matrix
    #   x1b x1c x2B x2C
    # 1   0   0   0   0
    # 2   0   0   0   0
    # 3   1   0   1   0
    # 4   1   0   1   0
    # 5   0   1   0   1
    # 6   0   1   0   1
by:
    model.matrix(~ x1 + x2 - 1, data = x, contrasts.arg = list(x1 = contr.treatment(letters[1:3]), x2 = contr.treatment(LETTERS[1:3])))
but actually I get:
    #   x1a x1b x1c x2B x2C
    # 1   1   0   0   0   0
    # 2   1   0   0   0   0
    # 3   0   1   0   1   0
    # 4   0   1   0   1   0
    # 5   0   0   1   0   1
    # 6   0   0   1   0   1
    # attr(,"assign")
    # [1] 1 1 1 2 2
    # attr(,"contrasts")
    # attr(,"contrasts")$x1 #
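
When the intercept is removed, `model.matrix()` codes the first factor with a full set of indicator columns regardless of the contrasts supplied. A simple workaround, sketched here as one option, is to keep the intercept so both factors use treatment contrasts (R's default for unordered factors) and then drop the intercept column.

```r
x <- data.frame(x1 = gl(3, 2, labels = letters[1:3]),
                x2 = gl(3, 2, labels = LETTERS[1:3]))

# Keep the intercept so both factors use treatment contrasts, then drop its column.
mm <- model.matrix(~ x1 + x2, data = x)[, -1]

colnames(mm)
mm
```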

R linear regression issue : lm.fit(x, y, offset = offset, singular.ok = singular.ok, …)

北城余情 submitted on 2019-11-29 10:44:35
I am trying to run a regression in R. The following code imports the CSV file without any problem:
    dat <- read.csv('http://pastebin.com/raw.php?i=EWsLjKNN',sep=";")
    dat # OK Works fine
    Regdata <- lm(Y~.,na.action=na.omit, data=dat)
    summary(Regdata)
However, when I try the regression it does not work. I get an error message:
    Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases
All the values in my CSV file are numbers, and if a "cell" is empty I have the "NA" value. Some columns are not empty and some other rows are sometimes
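
This error typically means that, after `na.omit`, no row is left: with `Y ~ .` every column is a predictor, so a single `NA` anywhere in a row removes that row. A minimal diagnostic sketch on hypothetical data (not the asker's CSV), followed by one possible remedy of keeping only the fully observed columns:

```r
# Hypothetical data: every row has an NA somewhere, so no complete case exists.
dat <- data.frame(
  Y  = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  X1 = c(0.5, 1.1, 1.4, 2.2, 2.4, 3.1),
  X2 = c(NA, 3, NA, 5, NA, 7),
  X3 = c(2, NA, 4, NA, 6, NA)
)

sum(complete.cases(dat))   # rows usable by lm(Y ~ ., na.action = na.omit)
colSums(is.na(dat))        # missing values per column

# Keep only the columns without missing values before fitting.
keep <- colSums(is.na(dat)) == 0
Regdata <- lm(Y ~ ., data = dat[, keep], na.action = na.omit)
coef(Regdata)
```

Which columns to drop, or whether to impute instead, depends on the data; the `keep` rule above is only the simplest choice.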

summary dataframe from several multiple regression outputs

荒凉一梦 submitted on 2019-11-29 08:51:09
I am running multiple OLS regressions. I have used the following lm call:
    GroupNetReturnsStockPickers <- read.csv("GroupNetReturnsStockPickers.csv", header=TRUE, sep=",", dec=".")
    ModelGroupNetReturnsStockPickers <- lm(StockPickersNet ~ Mkt.RF+SMB+HML+WML, data=GroupNetReturnsStockPickers)
    names(GroupNetReturnsStockPickers)
    summary(ModelGroupNetReturnsStockPickers)
which gives me the following summary output:
    Call:
    lm(formula = StockPickersNet ~ Mkt.RF + SMB + HML + WML, data = GroupNetReturnsStockPickers)
    Residuals:
          Min        1Q    Median        3Q       Max
    -0.029698 -0.005069 -0.000328  0.004546  0.041948
    Coefficients:
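
A sketch of one way to collect several such fits into a single data frame, using `coef(summary(fit))` to pull out each coefficient table. The data are simulated, and `MarketTimersNet`, `returns`, and `summary_df` are hypothetical names introduced for illustration; only `StockPickersNet` and the factor names come from the question.

```r
# Simulated factor-return data; column names follow the question where possible.
set.seed(7)
n <- 60
returns <- data.frame(Mkt.RF = rnorm(n, 0.005, 0.04), SMB = rnorm(n, 0, 0.02),
                      HML = rnorm(n, 0, 0.02), WML = rnorm(n, 0, 0.03))
returns$StockPickersNet <- 0.001 + 0.9 * returns$Mkt.RF + rnorm(n, sd = 0.01)
returns$MarketTimersNet <- 1.1 * returns$Mkt.RF + rnorm(n, sd = 0.01)  # hypothetical

responses <- c("StockPickersNet", "MarketTimersNet")

# One row per model and term: estimate, standard error, t value, p value.
rows <- lapply(responses, function(resp) {
  fit <- lm(reformulate(c("Mkt.RF", "SMB", "HML", "WML"), response = resp),
            data = returns)
  cf <- coef(summary(fit))
  data.frame(model = resp, term = rownames(cf),
             estimate = cf[, "Estimate"], std_error = cf[, "Std. Error"],
             t_value = cf[, "t value"], p_value = cf[, "Pr(>|t|)"],
             row.names = NULL)
})
summary_df <- do.call(rbind, rows)
summary_df
```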

How to correctly `dput` a fitted linear model (by `lm`) to an ASCII file and recreate it later?

那年仲夏 submitted on 2019-11-29 07:03:44
I want to persist an lm object to a file and reload it in another program. I know I can do this by writing/reading a binary file via saveRDS/readRDS, but I'd like to have an ASCII file instead of a binary file. At a more general level, I'd like to know why my idiom for reading in dput output is not behaving as I'd expect. Below are examples of making a simple fit, and successful and unsuccessful recreations of the model:
    dat_train <- data.frame(x=1:4, z=c(1, 2.1, 2.9, 4))
    fit <- lm(z ~ x, dat_train)
    rm(dat_train) # Just to make sure fit is not dependent upon `dat_train
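
This does not address the `dput` idiom itself, but if the goal is simply a text file rather than a binary one, `saveRDS()` has an `ascii` argument. A minimal round-trip sketch (the temporary file path is an arbitrary choice):

```r
dat_train <- data.frame(x = 1:4, z = c(1, 2.1, 2.9, 4))
fit <- lm(z ~ x, dat_train)

# ascii = TRUE writes a text representation instead of the default binary one.
path <- tempfile(fileext = ".rds")
saveRDS(fit, path, ascii = TRUE)

# Reload the model and use it for prediction.
fit2 <- readRDS(path)
predict(fit2, newdata = data.frame(x = 5:6))
```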

modify lm or loess function to use it within ggplot2's geom_smooth

亡梦爱人 submitted on 2019-11-29 06:36:30
I need to modify the lm (or eventually loess) function so I can use it in ggplot2's geom_smooth (or stat_smooth). For example, this is how stat_smooth is normally used:
    > qplot(data=diamonds, carat, price, facets=~clarity) + stat_smooth(method='lm')
I would like to define a custom lm2 function to use as the value of the method parameter in stat_smooth, so I can customize its behaviour.
    > lm2 <- function(formula, data, ...) {
        print(head(data))
        return(lm(formula, data, ...))
      }
    > qplot(data=diamonds, carat, price, facets=~clarity) + stat_smooth(method='lm2')
Note that I have used method='lm2' as

Using predict to find values of non-linear model

冷暖自知 submitted on 2019-11-29 04:37:54
I'm using the following code to see whether predict can give me the values of the dependent variable for a polynomial of order 2; in this case it is obviously y=x^2:
    x <- c(1, 2, 3, 4, 5 , 6)
    y <- c(1, 4, 9, 16, 25, 36)
    mypol <- lm(y ~ poly(x, 2, raw=TRUE))
    > mypol
    Call:
    lm(formula = y ~ poly(x, 2, raw = TRUE))
    Coefficients:
                (Intercept)  poly(x, 2, raw = TRUE)1  poly(x, 2, raw = TRUE)2
                          0                        0                        1
If I try to find the value for x=7, I get this:
    > predict(mypol, 7)
    Error in eval(predvars, data, env) : not that many frames on the stack
What am I doing wrong? If you read the help for predict.lm, you
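
As the help for `predict.lm` indicates, `newdata` is expected to be a data frame in which the model's variables can be found, not a bare number. A minimal sketch of the corrected call (`pred` is a name introduced here):

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1, 4, 9, 16, 25, 36)
mypol <- lm(y ~ poly(x, 2, raw = TRUE))

# newdata must be a data frame whose column name matches the predictor, x.
pred <- predict(mypol, newdata = data.frame(x = 7))
pred
```

Because the data follow y = x^2 exactly, the prediction should equal 49 up to floating-point error.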