lm

R: build separate models for each category

Submitted by 女生的网名这么多〃 on 2019-12-20 03:49:05
Question: Short version: how can I build separate models for each category (without splitting the data)? (I am new to R.) Long version: consider the following synthetic data:

    housetype,ht1,ht2,age,price
    O,0,1,1,1000
    O,0,1,2,2000
    O,0,1,3,3000
    N,1,0,1,10000
    N,1,0,2,20000
    N,1,0,3,30000

We could model this with two separate models: if housetype == 'O' then price = 1000 * age, else price = 10000 * age, i.e. a separate model per category type. This is what I have tried:

    model <- lm(price ~ housetype + age, data = datavar)
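One common approach (a sketch of the usual suggestion, not necessarily the answer the asker accepted) is to let the formula do the grouping: interacting age with housetype and dropping the intercept fits a separate slope per house type in a single lm() call, without splitting the data. Splitting plus lapply is shown as an alternative.

    # One model, one slope per housetype (datavar as described above)
    fit <- lm(price ~ housetype:age - 1, data = datavar)
    coef(fit)   # housetypeN:age and housetypeO:age, one slope per category

    # Alternatively, fit fully separate models per category:
    fits <- lapply(split(datavar, datavar$housetype),
                   function(d) lm(price ~ age, data = d))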

Constrained linear regression coefficients in R [duplicate]

Submitted by 大城市里の小女人 on 2019-12-20 02:43:19
Question: This question already has an answer here: R: constraining coefficients and error variance over multiple subsample regressions [closed] (1 answer). Closed 3 years ago. I'm estimating several ordinary least squares linear regressions in R. I want to constrain the estimated coefficients across the regressions so that they're the same. For example, I have the following:

    z1 ~ x + y
    z2 ~ x + y

And I would like the estimated coefficient on y in the first regression to be equal to the estimated …
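A common way to impose such a cross-equation constraint (a sketch, assuming z1, z2, x and y all live in one data frame df) is to stack the two outcomes into long format and fit a single lm(), sharing the y coefficient while letting the intercept and x slope vary by equation:

    # Hypothetical layout: df has columns z1, z2, x, y
    long <- rbind(data.frame(z = df$z1, eq = "z1", x = df$x, y = df$y),
                  data.frame(z = df$z2, eq = "z2", x = df$x, y = df$y))
    # eq-specific intercepts and x slopes, but a single shared coefficient on y
    fit <- lm(z ~ eq + eq:x + y - 1, data = long)
    coef(fit)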

split on factor, sapply, and lm [duplicate]

Submitted by 瘦欲@ on 2019-12-20 01:41:08
Question: This question already has answers here: Linear Regression and group by in R (10 answers). Closed 3 years ago. I want to apply lm() to observations grouped by subject, but cannot work out the sapply syntax. In the end I want a data frame with one row per subject, holding the intercept and slope (i.e. rows of: subj, lm$coefficients[1], lm$coefficients[2]).

    set.seed(1)
    subj <- rep(c("a","b","c"), 4)  # 4 observations each on 3 experimental subjects
    ind <- rnorm(12)                # 12 random numbers, the …
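A sketch of one way to finish this (the excerpt is truncated, so the response column dep below is a hypothetical stand-in): split the data frame by subj, fit lm() on each piece, and bind the coefficients into a one-row-per-subject data frame.

    dep <- 2 * ind + rnorm(12)                   # hypothetical response
    dat <- data.frame(subj, ind, dep)
    cf  <- t(sapply(split(dat, dat$subj),
                    function(d) coef(lm(dep ~ ind, data = d))))
    out <- data.frame(subj      = rownames(cf),
                      intercept = cf[, 1],
                      slope     = cf[, 2])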

linear regression in R without copying data in memory?

Submitted by 一曲冷凌霜 on 2019-12-20 00:44:31
Question: The standard way of doing a linear regression is something like this:

    l <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data = iris)

and then use predict(l, new_data) to make predictions, where new_data is a data frame with columns matching the formula. But lm() returns an lm object, which is a list containing loads of material that is mostly irrelevant in most situations. This includes a copy of the original data, and a bunch of named vectors and arrays the length/size of the data: R> str …
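One low-effort mitigation (a sketch, not a complete answer to the memory question) is to tell lm() not to store the model frame; prediction on new data still works because it only needs the coefficients, terms and qr components:

    # model = FALSE avoids keeping a copy of the data inside the fit;
    # per-observation vectors (residuals, fitted.values, effects, qr$qr)
    # are still stored, so this is only a partial saving
    slim <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data = iris,
               model = FALSE, x = FALSE, y = FALSE)
    predict(slim, newdata = iris[1:3, ])   # still works

    # For genuinely large data, biglm::biglm keeps only O(p^2) state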

Novice needs to loop lm in R

Submitted by 会有一股神秘感。 on 2019-12-19 10:44:00
Question: I'm a PhD student in genetics and I am trying to do association analysis of some genetic data using linear regression. In the table below I'm regressing each 'trait' against each 'SNP'; there is also an interaction term included as 'var'. I've only used R for two weeks and I don't have any programming background, so please explain any help provided, as I want to understand. This is a sample of my data:

    Sample ID   var  trait 1  trait 2  trait 3  SNP1  SNP2  SNP3
    77856517    2    188      3        2        1     0     0
    375689755   8    17       -1       -1       1 …
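A sketch of the kind of loop usually suggested here (the column names below are hypothetical stand-ins; the real data has spaces in its headers, e.g. 'trait 1', which would need back-ticks or renaming): build each formula with reformulate() and collect the SNP coefficient row from every fit.

    traits <- c("trait1", "trait2", "trait3")   # hypothetical, space-free names
    snps   <- c("SNP1", "SNP2", "SNP3")
    res <- list()
    for (tr in traits) {
      for (snp in snps) {
        f   <- reformulate(c(snp, "var", paste(snp, "var", sep = ":")),
                           response = tr)     # e.g. trait1 ~ SNP1 + var + SNP1:var
        fit <- lm(f, data = dat)              # dat: the asker's data frame
        res[[paste(tr, snp, sep = "_")]] <- coef(summary(fit))[snp, ]
      }
    }
    results <- do.call(rbind, res)  # estimate, std. error, t value, p value per pair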

QR decomposition different in lm and biglm?

Submitted by 我与影子孤独终老i on 2019-12-19 09:55:39
Question: I'm trying to recover the R matrix from the QR decomposition used in biglm. For this I am using a portion of the code in vcov.biglm, put into a function like so:

    qr.R.biglm <- function(object, ...) {
      # Return the qr.R matrix from a biglm object
      object$qr <- .Call("singcheckQR", object$qr)
      p <- length(object$qr$D)
      R <- diag(p)
      R[row(R) > col(R)] <- object$qr$rbar
      R <- t(R)
      R <- sqrt(object$qr$D) * R
      dimnames(R) <- list(object$names, object$names)
      return(R)
    }

More specifically, I'm …
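For a side-by-side check, a sketch (assuming the qr.R.biglm() helper above is defined and runs, and that the biglm package is installed): the usual observation is that the two factorisations agree only in magnitude, because a QR decomposition is unique only up to the signs of the rows of R.

    library(biglm)
    fit_lm  <- lm(Sepal.Width ~ Petal.Length + Petal.Width, data = iris)
    fit_big <- biglm(Sepal.Width ~ Petal.Length + Petal.Width, data = iris)
    R_lm  <- qr.R(fit_lm$qr)      # R from lm's Householder QR
    R_big <- qr.R.biglm(fit_big)  # R reconstructed from biglm's D and rbar
    # Entries should match in absolute value, though possibly not in sign
    all.equal(abs(R_lm), abs(R_big), check.attributes = FALSE)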

Rolling regression and prediction with lm() and predict()

Submitted by 六眼飞鱼酱① on 2019-12-19 09:42:01
Question: I need to apply lm() to an enlarging subset of my data frame dat, while making a prediction for the next observation. For example, I am doing:

    fit model      predict
    ----------     -------
    dat[1:3, ]     dat[4, ]
    dat[1:4, ]     dat[5, ]
    .              .
    .              .
    dat[-1, ]      dat[nrow(dat), ]

I know what I should do for a particular subset (related to this question: predict() and newdata - How does this work?). For example, to predict the last row, I do:

    dat1 = dat[1:(nrow(dat)-1), ]
    dat2 = dat[nrow(dat), ]
    fit = lm(log(clicks) ~ log …
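A minimal sketch of the expanding-window loop (the formula in the excerpt is truncated, so log(impressions) below is a hypothetical stand-in for the asker's remaining regressors):

    preds <- rep(NA_real_, nrow(dat))
    for (i in 3:(nrow(dat) - 1)) {
      fit          <- lm(log(clicks) ~ log(impressions), data = dat[1:i, ])
      preds[i + 1] <- predict(fit, newdata = dat[i + 1, , drop = FALSE])
    }
    # preds[k] holds the prediction for row k made from the fit on rows 1:(k-1)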

plot.lm Error: $ operator is invalid for atomic vectors

Submitted by 拟墨画扇 on 2019-12-19 09:37:52
Question: I have the following regression model with transformations:

    fit <- lm(I(NewValue ^ (1 / 3)) ~ I(CurrentValue ^ (1 / 3)) + Age + Type - 1,
              data = dataReg)
    plot(fit)

But plot gives me the following error: "Error: $ operator is invalid for atomic vectors". Any ideas about what I am doing wrong? Note: summary, predict, and residuals all work correctly.
Answer 1: This is actually quite an interesting observation. In fact, among all 6 plots supported by plot.lm, only the Q-Q plot fails in this case.
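A workaround sketch (the answer excerpt above only diagnoses the Q-Q panel; the fix shown here is an assumption on my part, not the accepted answer): pre-compute the cube-root columns so the formula contains plain variable names, after which the Q-Q panel of plot.lm works again.

    # Hypothetical helper columns added to the asker's dataReg
    dataReg$NewValue3     <- dataReg$NewValue^(1/3)
    dataReg$CurrentValue3 <- dataReg$CurrentValue^(1/3)
    fit2 <- lm(NewValue3 ~ CurrentValue3 + Age + Type - 1, data = dataReg)
    plot(fit2, which = 2)   # the Q-Q plot that previously failed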

R data.table loop subset by factor and do lm()

Submitted by 老子叫甜甜 on 2019-12-19 07:53:11
Question: I am trying to create a function, or even just work out how to run a loop using data.table syntax, where I can subset the table by a factor (in this case the id variable), then run a linear model on each subset and output the results. Sample data below.

    df <- data.frame(id = letters[1:3],
                     cyl = sample(c("a","b","c"), 30, replace = TRUE),
                     factor = sample(c(TRUE, FALSE), 30, replace = TRUE),
                     hp = sample(c(20:50), 30, replace = TRUE))
    dt <- as.data.table(df)
    fit <- lm(hp ~ cyl + factor, data = df)
    # how do …
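A data.table-style sketch (assuming the goal is one set of coefficients per id): group with by = and return the fitted coefficients as columns, so no explicit loop is needed.

    library(data.table)
    dt <- as.data.table(df)
    # One lm per id; as.list(coef(...)) spreads the named coefficients into columns
    # (assumes every id sees the same cyl levels, so the coefficient sets align)
    coefs <- dt[, as.list(coef(lm(hp ~ cyl + factor))), by = id]

    # Or keep each whole fitted model in a list column for later inspection:
    fits <- dt[, .(fit = list(lm(hp ~ cyl + factor, data = .SD))), by = id]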