lm | 易学教程

R data.table loop subset by factor and do lm()

I am trying to create a function, or even just work out how to run a loop using data.table syntax, where I can subset the table by factor (in this case the id variable), then run a linear model on each subset and output the results. Sample data below. `df <- data.frame(id = letters[1:3], cyl = sample(c("a","b","c"), 30, replace = TRUE), factor = sample(c(TRUE, FALSE), 30, replace = TRUE), hp = sample(c(20:50), 30, replace = TRUE))` `dt=as.data.table(df)` `fit <- lm(hp ~ cyl + factor, data = df)` `#how do I get the [i] to work here to subset and iterate by each factor and also do it in data.table syntax?`
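One way to approach the per-group fit described above is sketched here. It is not taken from the original thread: the data is a deterministic stand-in for the question's sample (only `hp` is random, and the seed is arbitrary) so that every `id` subset contains all levels of `cyl` and `factor`. The base R version uses `split()` and `lapply()`; the data.table version stores one model per group in a list column and only runs if the data.table package is installed.

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch

# Stand-in data (an assumption): same columns as the question's sample.
df <- data.frame(
  id     = rep(letters[1:3], times = 10),
  cyl    = rep(c("a", "b", "c"), each = 10),
  factor = rep(c(TRUE, FALSE), times = 15),
  hp     = sample(20:50, 30, replace = TRUE)
)

# Base R: split the rows by id, then fit the same formula on each piece.
fits  <- lapply(split(df, df$id), function(d) lm(hp ~ cyl + factor, data = d))
coefs <- lapply(fits, coef)  # named list: one coefficient vector per id
print(coefs)

# data.table: fit once per group and keep the models in a list column.
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(df)
  dt_fits <- dt[, .(fit = list(lm(hp ~ cyl + factor, data = .SD))), by = id]
  print(lapply(setNames(dt_fits$fit, dt_fits$id), coef))
}
```

Keeping the fitted models in a list (rather than returning coefficients directly from `j`) avoids column mismatches when a subset happens to lack one of the factor levels.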

Why do I get NA coefficients and how does `lm` drop reference level for interaction

I am trying to understand how R determines reference groups for interactions in a linear model. Consider the following: df <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("1", "2", "3", "4", "5"), class = "factor"), year = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("1", "2"), class = "factor"), treatment = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,

How does the subset argument work in the lm() function?

I have been trying to figure out how the subset argument in R's lm() function works. The following code in particular seems dubious to me: `data(mtcars)` `summary(lm(mpg ~ wt, data=mtcars))` `summary(lm(mpg ~ wt, cyl, data=mtcars))` In every case the regression has 32 observations: `dim(lm(mpg ~ wt, cyl, data=mtcars)$model)` gives `[1] 32 2` and `dim(lm(mpg ~ wt, data=mtcars)$model)` gives `[1] 32 2`, yet the coefficients change (along with the R²). The help doesn't provide much information on this matter: "subset: an optional vector specifying a subset of observations to be used in the fitting process". As a general principle
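The behaviour in the question can be reproduced with a short sketch. As far as I can tell, the explanation is argument matching: the formals of `lm()` start with `formula`, `data`, `subset`, so with `data=` named, the unnamed `cyl` is matched to `subset`. Its values (4, 6, 8) are then used as row indices, which still yields 32 rows, but only three distinct ones. The object names below are illustrative and not from the original thread.

```r
# mtcars ships with R's datasets package.
fit_all  <- lm(mpg ~ wt, data = mtcars)

# Unnamed second argument: matched positionally to `subset`.
fit_pos  <- lm(mpg ~ wt, cyl, data = mtcars)

# The same fit written out by hand: index rows by the cyl values (4, 6, 8).
fit_rows <- lm(mpg ~ wt, data = mtcars[mtcars$cyl, ])

# What was probably intended: a logical condition selecting observations.
fit_4cyl <- lm(mpg ~ wt, data = mtcars, subset = cyl == 4)

nrow(fit_pos$model)                       # still 32 rows, but repeated ones
nrow(fit_4cyl$model)                      # only the four-cylinder cars
all.equal(coef(fit_pos), coef(fit_rows))  # the two fits agree
```

A logical vector (or an expression that evaluates to one, such as `cyl == 4`) is the usual way to pass `subset`; a numeric vector is treated as a set of row positions.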

Plot conditional density curve `P(Y|X)` along a linear regression line

This is my data frame, with two columns Y (response) and X (covariate): ## Editor edit: use `dat` not `data` dat <- structure(list(Y = c(NA, -1.793, -0.642, 1.189, -0.823, -1.715, 1.623, 0.964, 0.395, -3.736, -0.47, 2.366, 0.634, -0.701, -1.692, 0.155, 2.502, -2.292, 1.967, -2.326, -1.476, 1.464, 1.45, -0.797, 1.27, 2.515, -0.765, 0.261, 0.423, 1.698, -2.734, 0.743, -2.39, 0.365, 2.981, -1.185, -0.57, 2.638, -1.046, 1.931, 4.583, -1.276, 1.075, 2.893, -1.602, 1.801, 2.405, -5.236, 2.214, 1.295, 1.438, -0.638, 0.716, 1.004, -1.328, -1.759, -1.315, 1.053, 1.958, -2.034, 2.936, -0.078, -0.676, -2

linear regression using lm() - surprised by the result

I ran a linear regression on my data using the lm function. Everything works (no error message), but I'm somewhat surprised by the result: I am under the impression that R "misses" a group of points, i.e. the intercept and slope are not the best fit. For instance, I am referring to the group of points at coordinates x=15-25, y=0-20. My questions: Is there a function to compare the fit with "expected" coefficients and "lm-calculated" coefficients? Have I made a silly mistake when coding, leading lm to do that? Following some answers, additional information on x and y: x and y are both visual

Repeat the re-sampling function 1000 times? Using lapply?

Please help me out! I appreciate any help. Thanks! I am having trouble repeating the re-sampling 1000 times. I tried using replicate() to do that, but it's not working. Is there any other method? Can anyone show me whether this can be done using lapply? Following is my code: #sampling 1000 betas0 & 1 (coefficients) from the data get.beta=function(data,indices){ data=data[indices,] #let boot to select sample lm.out=lm(y ~ x,data=data) return(lm.out$coefficients) } n=nrow(data) get.beta
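A minimal sketch of how the resampling above could be repeated 1000 times, with both `replicate()` and `lapply()`. The data frame `dat` is a made-up stand-in (the question's data is not shown), and the seed is arbitrary. `get.beta` keeps the signature from the question; each call receives a fresh set of row indices drawn with replacement.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in data with columns x and y.
dat <- data.frame(x = rnorm(50))
dat$y <- 1 + 2 * dat$x + rnorm(50)

# Same shape as the question's function: refit on the selected rows.
get.beta <- function(data, indices) {
  data <- data[indices, ]
  coef(lm(y ~ x, data = data))
}

n <- nrow(dat)

# replicate(): returns a 2 x 1000 matrix (one column per resample).
betas_rep <- replicate(1000, get.beta(dat, sample(n, n, replace = TRUE)))

# lapply(): returns a list, bound here into a 1000 x 2 matrix.
betas_lap <- lapply(seq_len(1000), function(i) {
  get.beta(dat, sample(n, n, replace = TRUE))
})
betas_lap <- do.call(rbind, betas_lap)

dim(betas_rep)
dim(betas_lap)
```

The `(data, indices)` signature is also what `boot::boot()` expects for its `statistic` argument, so the same function could be passed there instead of looping by hand.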

Use of offset in lm regression - R

I have this program: `dens <- read.table('DensPiu.csv', header = FALSE)` `fl <- read.table('FluxPiu.csv', header = FALSE)` `mydata <- data.frame(c(dens),c(fl))` `dat = subset(mydata, dens>=3.15)` `colnames(dat) <- c("x", "y")` `attach(dat)` and I want to do a least-squares regression on the data contained in dat. The function has the form y ~ a + b*x, and I want the regression line to pass through a specific point P(x0, y0) (which is not the origin). I'm trying to do it like this: `x0 <- 3.15` `y0 <-283.56` `regression <- lm(y ~ I(x-x0)-1, offset=y0)` (I think that data = dat is not necessary in this case) but I
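The constraint described above can be sketched in two equivalent ways. Forcing the line through (x0, y0) means fitting y - y0 = b * (x - x0) with no intercept. The data below is made up (the CSV files from the question are not available), and the seed is arbitrary. As far as I know, `lm()` expects `offset` to have one value per observation, so the scalar `y0` is repeated to the number of rows.

```r
set.seed(7)  # arbitrary seed for the made-up data

x0 <- 3.15
y0 <- 283.56

# Hypothetical stand-in for `dat` (columns x and y, with x >= 3.15).
dat <- data.frame(x = runif(40, 3.15, 6))
dat$y <- y0 + 25 * (dat$x - x0) + rnorm(40, sd = 5)

# Option 1: shift both variables and drop the intercept.
fit_shift <- lm(I(y - y0) ~ 0 + I(x - x0), data = dat)

# Option 2: keep y as the response and pass y0 as a per-row offset.
fit_offset <- lm(y ~ 0 + I(x - x0), data = dat,
                 offset = rep(y0, nrow(dat)))

b <- unname(coef(fit_shift))  # slope of the constrained line
a <- y0 - b * x0              # implied intercept in y = a + b * x
c(intercept = a, slope = b)
```

Both fits give the same slope; the offset version has the advantage that `fitted()` is on the original scale of y.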

Linear models in R with different combinations of variables

I am new to R and I am stuck with a problem. I am trying to read a set of data into a table and I want to perform linear modeling. Below is how I read my data and my variable names: >data =read.table(datafilename,header=TRUE) >names(data) [1] "price" "model" "size" "year" "color" What I want to do is create several linear models using different combinations of the variables (price being the target), such as: > attach(data) > model1 = lm(price~model+size) > model2 = lm(price~model+year) > model3 = lm(price~model+color) > model4 = lm(price~model+size) > model4 = lm(price~size+year+color) #...
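Instead of writing each model by hand, the combinations could be generated programmatically. The sketch below is one possible approach, not the thread's answer: `combn()` enumerates the predictor subsets, `reformulate()` turns each subset into a formula with `price` as the response, and `lapply()` fits them all. The data frame `cars` is a made-up stand-in with the column names from the question.

```r
set.seed(3)  # arbitrary seed for the made-up data
n <- 60

# Hypothetical stand-in data with the question's column names.
cars <- data.frame(
  model = factor(sample(c("m1", "m2", "m3"), n, replace = TRUE)),
  size  = runif(n, 1, 3),
  year  = sample(2000:2010, n, replace = TRUE),
  color = factor(sample(c("red", "blue"), n, replace = TRUE))
)
cars$price <- 5000 + 2000 * cars$size + 100 * (cars$year - 2000) +
  rnorm(n, sd = 500)

predictors <- setdiff(names(cars), "price")

# Every non-empty subset of the predictors, as a flat list of name vectors.
combos <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# One formula per subset, e.g. price ~ model + size, then one fit each.
formulas <- lapply(combos, reformulate, response = "price")
fits <- lapply(formulas, lm, data = cars)
names(fits) <- vapply(formulas,
                      function(f) paste(deparse(f), collapse = ""),
                      character(1))

length(fits)
names(fits)[1:5]
```

Passing `data = cars` explicitly avoids the need for `attach()`, and the named list makes it easy to look up a model by its formula, for example `fits[["price ~ model + size"]]`.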