lm

Shortcut using lm() in R for formula

Submitted by 北慕城南 on 2019-11-28 01:58:26
It is possible to use a shortcut for the formula in lm():

m <- matrix(rnorm(100), ncol=5)
lm(m[,1] ~ m[,2:5])

Here it is the same as

lm(m[,1] ~ m[,2] + m[,3] + m[,4] + m[,5])

but when the variables are not all of the same type (at least this is my assumption for now) this does not work and I get the error:

Error in model.frame.default(formula = hm[, 1] ~ hm[, 2:4], drop.unused.levels = TRUE) :
  invalid type (list) for variable 'hm[, 2:4]'

Data (hm):

    N cor.distance switches  time
1  50   0.04707842        2 0.003
2 100  -0.10769441        2 0.004
3 200  -0.01278359        2 0.004
4 300   0.04229509        5 0.008
5 500  -0
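A minimal sketch of two common workarounds, assuming hm is a data frame with the columns shown above: subsetting a data frame with hm[, 2:4] returns a list, which lm() cannot use directly, so either coerce the predictors to a matrix or use the dot shorthand together with the data argument.

# Workaround 1: coerce the predictor columns to a matrix
fit1 <- lm(hm[, 1] ~ as.matrix(hm[, 2:4]))

# Workaround 2: let the formula expand to all remaining columns of hm
fit2 <- lm(N ~ ., data = hm)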

r predict function returning too many values [closed]

Submitted by 一个人想着一个人 on 2019-11-28 01:39:30
I've read other postings regarding named variables and tried implementing the answers, but I still get too many values for my new data that I want to run my existing model on. Here is working example code:

set.seed(123)
mydata <- data.frame("y" = rnorm(100, mean = 0, sd = 1), "x" = c(1:100))
mylm <- lm(y ~ x, data = mydata)

# ok so mylm is a model on 100 points - lets look at it and the data
par(mfrow = c(2, 2))
plot(mylm)
par(mfrow = c(1, 1))

predvals <- predict(mylm, data = mydata)
plot(mydata$x, mydata$y)
lines(predvals)

No surprises here - a straight line through generated points - both 100 observations in
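A likely explanation, sketched here on the assumption that the real "new data" has a different number of rows than the training set: predict() takes new observations through the newdata argument, not data. An argument named data= is silently absorbed by ..., so the call above simply returns the fitted values for all 100 training rows.

# Pass a data frame whose column names match the formula variables:
newobs <- data.frame(x = 101:110)        # hypothetical new observations
predict(mylm, newdata = newobs)          # returns exactly 10 predictions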

modify lm or loess function to use it within ggplot2's geom_smooth

Submitted by 南笙酒味 on 2019-11-28 00:11:10
Question: I need to modify the lm (or eventually loess) function so I can use it in ggplot2's geom_smooth (or stat_smooth). For example, this is how stat_smooth is used normally:

> qplot(data=diamonds, carat, price, facets=~clarity) + stat_smooth(method='lm')

I would like to define a custom lm2 function to use as the value for the method parameter in stat_smooth, so I can customize its behaviour.

> lm2 <- function(formula, data, ...) {
    print(head(data))
    return(lm(formula, data, ...))
  }
> qplot(data
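A sketch of one way this can be done: in current ggplot2 the method argument also accepts a function object rather than a string, so a wrapper around lm() with the same calling convention can be passed directly. The wrapper below is illustrative; prior-weight handling is not shown.

library(ggplot2)

# A wrapper with the signature lm() expects; extra arguments from
# stat_smooth()/geom_smooth() are passed through via `...`.
lm2 <- function(formula, data, ...) {
  message("fitting on ", nrow(data), " rows")   # custom behaviour goes here
  lm(formula, data = data, ...)
}

ggplot(diamonds, aes(carat, price)) +
  facet_wrap(~clarity) +
  geom_smooth(method = lm2, formula = y ~ x)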

lm function in R does not give coefficients for all factor levels in categorical data

Submitted by 若如初见. on 2019-11-28 00:01:02
I was trying out linear regression in R with categorical attributes and observed that I don't get a coefficient value for each of the different factor levels I have. Please see my code below: I have 5 factor levels for states, but see only 4 coefficient values.

> states = c("WA","TE","GE","LA","SF")
> population = c(0.5,0.2,0.6,0.7,0.9)
> df = data.frame(states,population)
> df
  states population
1     WA        0.5
2     TE        0.2
3     GE        0.6
4     LA        0.7
5     SF        0.9
> states=NULL
> population=NULL
> lm(formula=population~states,data=df)

Call:
lm(formula = population ~ states, data = df)

Coefficients:
(Intercept)
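The reason, in short: with R's default treatment contrasts the first factor level (alphabetically "GE" here) becomes the baseline and is absorbed into the intercept, so only the remaining 4 levels get their own coefficients. A small sketch of how to see one coefficient per level by dropping the intercept:

lm(population ~ 0 + states, data = df)
# statesGE  statesLA  statesSF  statesTE  statesWA
#      0.6       0.7       0.9       0.2       0.5
# (with one observation per level, each coefficient is just that level's value)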

Linear model function lm() error: NA/NaN/Inf in foreign function call (arg 1)

Submitted by 风格不统一 on 2019-11-27 23:29:13
Say I have a data.frame a and I use

m.fit <- lm(col2 ~ col3 * col4, na.action = na.exclude)

col2 has some NA values; col3 and col4 have values less than 1. I keep getting

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in foreign function call (arg 1)

I've checked the mailing list and it appears to be because of the NAs in col2, but I tried na.action = na.exclude/omit/pass and none of them seem to work. I've tested lm again on the first 10 entries, so it is definitely not because of the NAs. The problem with this error is that every Google result seems to be pointing at NA
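A diagnostic sketch, assuming the columns are meant to be numeric and that the data frame is called a (the original call does not show a data argument): na.omit/na.exclude drop NA and NaN rows, but they do not remove Inf values, and they do nothing about columns stored as character or factor, both of which can trigger this error.

# Count non-finite entries (NA, NaN, Inf) and check the column types
sapply(a[c("col2", "col3", "col4")], function(x) sum(!is.finite(x)))
sapply(a[c("col2", "col3", "col4")], class)

# If Inf values are the culprit, recode them to NA so na.exclude can handle them
num <- sapply(a, is.numeric)
a[num] <- lapply(a[num], function(x) { x[is.infinite(x)] <- NA; x })
m.fit <- lm(col2 ~ col3 * col4, data = a, na.action = na.exclude)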

R: numeric 'envir' arg not of length one in predict()

Submitted by 一世执手 on 2019-11-27 21:00:43
I'm trying to predict a value in R using the predict() function, by passing variables along to the model. I am getting the following error:

Error in eval(predvars, data, env) : numeric 'envir' arg not of length one

Here is my data frame, named df:

df <- read.table(text = '
   Quarter  Coupon      Total
1  "Dec 06" 25027.072 132450574
2  "Dec 07" 76386.820 194154767
3  "Dec 08" 79622.147 221571135
4  "Dec 09" 74114.416 205880072
5  "Dec 10" 70993.058 188666980
6  "Jun 06" 12048.162 139137919
7  "Jun 07" 46889.369 165276325
8  "Jun 08" 84732.537 207074374
9  "Jun 09" 83240.084 221945162
10 "Jun 10" 81970.143
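This error typically appears when the second argument handed to predict() is a bare numeric vector rather than a data frame, so it ends up being used as an evaluation environment. A sketch of the usual fix; the model formula below is an assumption, since the question is cut off before the model is shown.

fit <- lm(Total ~ Coupon, data = df)

# This raises "numeric 'envir' arg not of length one":
# predict(fit, df$Coupon)

# Passing a data frame whose column names match the formula works:
predict(fit, newdata = data.frame(Coupon = c(70000, 80000)))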

R: lm() result differs when using `weights` argument and when using manually reweighted data

Submitted by 眉间皱痕 on 2019-11-27 20:57:29
In order to correct for heteroskedasticity in the error terms, I am running the following weighted least squares regression in R:

#Call:
#lm(formula = a ~ q + q2 + b + c, data = mydata, weights = weighting)
#
#Weighted Residuals:
#     Min       1Q   Median       3Q      Max
#-1.83779 -0.33226  0.02011  0.25135  1.48516
#
#Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
#(Intercept) -3.939440   0.609991  -6.458 1.62e-09 ***
#q            0.175019   0.070101   2.497 0.013696 *
#q2           0.048790   0.005613   8.693 8.49e-15 ***
#b            0.473891   0.134918   3.512 0.000598 ***
#c            0.119551   0.125430   0.953 0.342167
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0
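A sketch of why the two approaches usually disagree, using the variable names from the output above (the data itself is not available here): lm(..., weights = w) minimises sum(w * residuals^2), which is equivalent to multiplying the response and every column of the design matrix, including the intercept column, by sqrt(w) and fitting without an intercept. Multiplying only the raw variables by the weights fits a different model.

w  <- mydata$weighting
X  <- model.matrix(a ~ q + q2 + b + c, data = mydata)   # includes intercept
Xw <- X * sqrt(w)
yw <- mydata$a * sqrt(w)

fit_weights <- lm(a ~ q + q2 + b + c, data = mydata, weights = w)
fit_manual  <- lm(yw ~ 0 + Xw)

all.equal(unname(coef(fit_weights)), unname(coef(fit_manual)))   # TRUE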

Understanding lm and environment

Submitted by 江枫思渺然 on 2019-11-27 18:40:51
Question: I'm executing lm() with the arguments formula, data, na.action, and weights. My weights are stored in a numeric variable. When I specify formula as a character string (i.e. formula = "Response~0+."), I get an error that weights is not of the proper length (even though it is). When I specify formula without the quotes (i.e. formula = Response~0+.), the function works fine. I stumbled upon this sentence in the lm() documentation: "All of weights, subset and offset are evaluated in the same way as
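The usual recommendation, sketched with hypothetical object names: a character string has to be coerced to a formula inside lm(), so the resulting formula does not carry the caller's environment, which is where lm() would otherwise look up variables such as the weights vector. Building the formula yourself before the call avoids this.

f <- as.formula("Response ~ 0 + .")     # environment(f) is now the caller's
fit <- lm(f, data = dat, weights = w, na.action = na.omit)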

Using predict to find values of non-linear model

Submitted by ♀尐吖头ヾ on 2019-11-27 18:37:35
Question: I'm trying the following code to see whether predict can help me find the values of the dependent variable for a polynomial of order 2; in this case it is obviously y = x^2:

x <- c(1, 2, 3, 4, 5, 6)
y <- c(1, 4, 9, 16, 25, 36)
mypol <- lm(y ~ poly(x, 2, raw=TRUE))

> mypol

Call:
lm(formula = y ~ poly(x, 2, raw = TRUE))

Coefficients:
            (Intercept)  poly(x, 2, raw = TRUE)1  poly(x, 2, raw = TRUE)2
                      0                        0                        1

If I try to find the value for x=7, I get this:

> predict(mypol, 7)
Error in eval(predvars, data, env
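A minimal sketch of the standard fix: predict() needs the new observations in a data frame whose column name matches the variable used in the formula, not a bare number.

predict(mypol, newdata = data.frame(x = 7))
#  1
# 49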

Is there a faster lm function

Submitted by 折月煮酒 on 2019-11-27 17:22:40
Question: I would like to get the slope of a linear regression fit for 1M separate data sets (1M * 50 rows as a data.frame, or a 1M * 50 array). At the moment I am using the lm() function, which takes a very long time (about 10 minutes). Is there any faster function for linear regression?

Answer 1: Yes, there are:

R itself has lm.fit(), which is more bare-bones: no formula notation, much simpler result set.
Several of our Rcpp-related packages have fastLm() implementations: RcppArmadillo, RcppEigen, RcppGSL. We have
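A small sketch of the bare-bones interface for a single fit, with made-up data: lm.fit() and .lm.fit() take a design matrix directly and skip the formula/model-frame machinery, which is where much of lm()'s per-call overhead comes from.

set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50)

coefs  <- lm.fit(cbind(1, x), y)$coefficients    # intercept, slope
slope  <- coefs[2]

# Even leaner (available in base R):
slope2 <- .lm.fit(cbind(1, x), y)$coefficients[2]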