Right way to use lm in R

Submitted by 有些话、适合烂在心里 on 2019-12-08 07:16:25

Question


I do not have a very clear idea of how to use functions like lm() that take a formula and a data.frame. On the web I have read about different approaches, but sometimes R gives warnings and other unexpected output.

Suppose for example a linear model where the output vector y is explained by the matrix X.

I read that the best way is to use a data.frame (especially if we are going to use the predict function later).

In a situation where X is a matrix, is this the best way to use lm?

n=100
p=20
n_new=50

X=matrix(rnorm(n*p),n,p)
Y=rnorm(n)
data=list("x"=X,"y"=Y)
l=lm(y~x,data)  

X_new=matrix(rnorm(n_new*p),n_new,p)
pred=predict(l,as.data.frame(X_new))

Answer 1:


How about:

l <- lm(y~.,data=data.frame(X,y=Y))
pred <- predict(l,data.frame(X_new))

In this case R constructs the column names (X1 ... X20) automatically, but when you use the y~. syntax you don't need to know them.
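To make the naming behaviour concrete, here is a minimal sketch that regenerates data of the same shape as in the question and inspects the column names before fitting. The seed and the names train and new_df are assumptions introduced for illustration only.

```r
# Simulated data with the same dimensions as in the question
set.seed(1)                      # assumed seed, only for reproducibility
n <- 100; p <- 20; n_new <- 50
X     <- matrix(rnorm(n * p), n, p)
Y     <- rnorm(n)
X_new <- matrix(rnorm(n_new * p), n_new, p)

# data.frame() names the columns of an unnamed matrix X1, X2, ...
train  <- data.frame(X, y = Y)   # hypothetical name
new_df <- data.frame(X_new)      # hypothetical name
names(train)                     # predictor names followed by "y"
names(new_df)                    # same predictor names as in train

# y ~ . uses every column except y as a predictor
l    <- lm(y ~ ., data = train)
pred <- predict(l, new_df)
length(pred)                     # one prediction per row of X_new
```

Because both data frames are built the same way, the predictor names in the new data match those used in the fit, which is what predict() relies on.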

Alternatively, if you are always going to fit linear regressions based on a matrix, you can use lm.fit() and compute the predictions yourself using matrix multiplication: you have to use cbind(1,.) to add an intercept column.

fit <- lm.fit(cbind(1,X),Y)
all(coef(l)==fit$coefficients)  ## TRUE
pred <- cbind(1,X_new) %*% fit$coefficients

(You could also use cbind(1,X_new) %*% coef(l).) This is efficient, but it skips a lot of the error-checking steps, so use it with caution ...
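As a sanity check on the matrix route, the following sketch compares the hand-computed predictions with those from predict(). It regenerates data of the same shape as in the question; the seed and the names pred_lm and pred_manual are assumptions for illustration. The comparison uses all.equal(), which allows for floating-point tolerance, rather than exact equality.

```r
set.seed(1)                      # assumed seed, only for reproducibility
n <- 100; p <- 20; n_new <- 50
X     <- matrix(rnorm(n * p), n, p)
Y     <- rnorm(n)
X_new <- matrix(rnorm(n_new * p), n_new, p)

# Formula interface
l       <- lm(y ~ ., data = data.frame(X, y = Y))
pred_lm <- predict(l, data.frame(X_new))

# Matrix interface: the intercept column is added by hand
fit         <- lm.fit(cbind(1, X), Y)
pred_manual <- drop(cbind(1, X_new) %*% fit$coefficients)

# Compare values only, ignoring the differing names
same <- isTRUE(all.equal(unname(pred_lm), unname(pred_manual)))
same
```

If the two sets of predictions disagree, the usual suspects are a missing intercept column or columns of X_new in a different order from X.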




Answer 2:


In a situation like the one you describe, you have no reason not to turn your matrix into a data frame. Try:

myData <- as.data.frame(cbind(Y, X))
l      <- lm(Y~., data=myData)
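If you later want predictions from this fit, the new data frame needs the same predictor names as myData. A minimal sketch, assuming data of the same shape as in the question (the seed and the name newData are introduced here for illustration), copies the names over rather than relying on what as.data.frame() generates:

```r
set.seed(1)                      # assumed seed, only for reproducibility
n <- 100; p <- 20; n_new <- 50
X     <- matrix(rnorm(n * p), n, p)
Y     <- rnorm(n)
X_new <- matrix(rnorm(n_new * p), n_new, p)

myData <- as.data.frame(cbind(Y, X))
l      <- lm(Y ~ ., data = myData)

# Reuse the predictor names from the training data so predict() can match them
newData        <- as.data.frame(X_new)   # hypothetical name
names(newData) <- setdiff(names(myData), "Y")
pred           <- predict(l, newdata = newData)
length(pred)                             # one prediction per row of X_new
```

This assumes the columns of X_new are in the same order as the columns of X.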


Source: https://stackoverflow.com/questions/20075822/right-way-to-use-lm-in-r
