lm

Is there a fast way to estimate a simple regression (a regression line with only an intercept and a slope)?

旧街凉风 submitted on 2019-11-27 16:31:07
This question relates to a machine learning feature selection procedure. I have a large matrix of features, where the columns are the features of the subjects (rows):

set.seed(1)
features.mat <- matrix(rnorm(10*100), ncol = 100)
colnames(features.mat) <- paste("F", 1:100, sep = "")
rownames(features.mat) <- paste("S", 1:10, sep = "")

The response was measured for each subject (S) under different conditions (C) and therefore looks like this:

response.df <- data.frame(S = c(sapply(1:10, function(x) rep(paste("S", x, sep = ""), 100))),
                          C = rep(paste("C", 1:100, sep = ""), 10),
                          response = rnorm(1000),
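Not part of the excerpt, but a minimal sketch of the usual shortcut: for a single predictor the OLS slope and intercept have closed forms (slope = cov(x, y)/var(x), intercept = mean(y) - slope*mean(x)), so a whole feature matrix can be handled with vectorised column operations instead of one lm() call per feature. The function name and toy data below are illustrative only.

# Closed-form simple regression for every column of a feature matrix at once.
fast_slopes <- function(X, y) {
  slopes <- as.vector(cov(X, y)) / apply(X, 2, var)   # cov(x_j, y) / var(x_j)
  intercepts <- mean(y) - slopes * colMeans(X)
  cbind(intercept = intercepts, slope = slopes)
}

set.seed(1)
X <- matrix(rnorm(10 * 100), ncol = 100)
y <- rnorm(10)
head(fast_slopes(X, y), 3)
coef(lm(y ~ X[, 1]))   # matches the first row, without lm()'s overhead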

Aligning a data frame with missing values

荒凉一梦 submitted on 2019-11-27 15:49:10
I'm using a data frame with many NA values. While I'm able to create a linear model, I am subsequently unable to line up the fitted values of the model with the original data, because of the missing values and the lack of an indicator column. Here's a reproducible example:

library(MASS)
dat <- Aids2
# Add NA's
dat[floor(runif(100, min = 1, max = nrow(dat))), 3] <- NA
# Create a model
model <- lm(death ~ diag + age, data = dat)
# Different lengths
length(fitted.values(model))  # 2745
nrow(dat)                     # 2843

There are actually three solutions here: pad the fitted values with NA ourselves; use predict() to compute fitted
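A minimal sketch of the simplest of those routes, letting lm() do the padding via na.exclude (the NA positions below are illustrative, not the original ones):

library(MASS)
set.seed(1)
dat <- Aids2
dat[sample(nrow(dat), 100), "age"] <- NA          # illustrative missing values

# na.exclude remembers which rows were dropped, so fitted values and
# residuals are padded with NA back to the full number of rows.
model <- lm(death ~ diag + age, data = dat, na.action = na.exclude)
length(fitted(model)) == nrow(dat)                # TRUE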

How to interpret lm() coefficient estimates when using the bs() function for splines

半世苍凉 submitted on 2019-11-27 15:13:14
Question: I'm using a set of points which go from (-5,5) to (0,0) and (5,5) in a symmetric V-shape. I'm fitting a model with lm() and the bs() function to fit a V-shaped spline:

lm(formula = y ~ bs(x, degree = 1, knots = c(0)))

I get the V-shape when I predict outcomes with predict() and draw the prediction line. But when I look at the model estimates with coef(), I see estimates that I don't expect.

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.93821    0.16117  30.639 1.40e-09 ***
bs(x,
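Not from the excerpt, but a sketch of why the numbers look odd: with degree = 1 the B-spline coefficients are not slopes; the intercept is the fitted value at the left boundary knot, and each basis coefficient is the offset from that value at the interior knot or the right boundary. The toy V-shaped data below is an assumption.

library(splines)
set.seed(1)
x <- seq(-5, 5, length.out = 21)
y <- abs(x) + rnorm(length(x), sd = 0.3)          # rough V-shape
fit <- lm(y ~ bs(x, degree = 1, knots = 0))
co <- unname(coef(fit))

# Fitted value at x = -5 is the intercept; at x = 0 and x = 5 it is the
# intercept plus the corresponding basis coefficient.
c(at_minus5 = co[1], at_0 = co[1] + co[2], at_5 = co[1] + co[3])
predict(fit, data.frame(x = c(-5, 0, 5)))         # should agree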

Finding where two linear fits intersect in R

时间秒杀一切 submitted on 2019-11-27 14:11:04
I have two linear fits that I've gotten from lm calls in my R script, for instance:

fit1 <- lm(y1 ~ x1)
fit2 <- lm(y2 ~ x2)

I'd like to find the (x, y) point at which these two lines (fit1 and fit2) intersect, if they intersect at all. One way to avoid the geometry is to re-parameterize the equations as

y1 = m1 * (x1 - x0) + y0
y2 = m2 * (x2 - x0) + y0

in terms of their intersection point (x0, y0), and then perform the fit of both at once using nls, so that the returned values of x0 and y0 give the result:

# test data
set.seed(123)
x1 <- 1:10
y1 <- -5 + x1 + rnorm(10)
x2 <- 1:10
y2 <- 5 - x1
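As a sketch of the direct alternative (not the nls route above), the crossing point can also be read straight off the two coefficient vectors, assuming the slopes differ; the test data here mirror the excerpt:

set.seed(123)
x1 <- 1:10; y1 <- -5 + x1 + rnorm(10)
x2 <- 1:10; y2 <-  5 - x2 + rnorm(10)
fit1 <- lm(y1 ~ x1)
fit2 <- lm(y2 ~ x2)

# Lines a1 + b1*x and a2 + b2*x cross where a1 + b1*x = a2 + b2*x.
cf1 <- coef(fit1); cf2 <- coef(fit2)
x0 <- unname((cf2[1] - cf1[1]) / (cf1[2] - cf2[2]))
y0 <- unname(cf1[1] + cf1[2] * x0)
c(x = x0, y = y0)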

Linear Regression with a known fixed intercept in R

孤者浪人 submitted on 2019-11-27 10:57:09
I want to calculate a linear regression using the lm() function in R. Additionally, I want to get the slope of a regression where I explicitly give the intercept to lm(). I found an example on the internet and I tried to read the R help "?lm" (unfortunately I'm not able to understand it), but I did not succeed. Can anyone tell me where my mistake is?

lin <- data.frame(x = c(0:6), y = c(0.3, 0.1, 0.9, 3.1, 5, 4.9, 6.2))
plot(lin$x, lin$y)
regImp = lm(formula = lin$x ~ lin$y)
abline(regImp, col = "blue")

# Does not work:
# Use 1 as intercept
explicitIntercept = rep(1, length(lin$x))
regExp = lm
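Not part of the excerpt, but a minimal sketch of the usual way to hold the intercept at a known value: subtract it from the response (or supply it as an offset) and suppress the automatically estimated intercept, so lm() only estimates the slope. The fixed value 1 below is just an example.

lin <- data.frame(x = 0:6, y = c(0.3, 0.1, 0.9, 3.1, 5, 4.9, 6.2))

fixed_int <- 1
# Equivalent: lm(y ~ 0 + x + offset(rep(fixed_int, nrow(lin))), data = lin)
fit <- lm(I(y - fixed_int) ~ 0 + x, data = lin)
coef(fit)                                  # slope, with the intercept held at 1

plot(y ~ x, data = lin)
abline(a = fixed_int, b = coef(fit), col = "blue")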

Adding lagged variables to an lm model?

我们两清 submitted on 2019-11-27 10:34:38
Question: I'm using lm on a time series, which actually works quite well, and it's super fast. Let's say my model is:

> formula <- y ~ x

I train this on a training set:

> train <- data.frame(x = seq(1, 3), y = c(2, 1, 4))
> model <- lm(formula, train)

... and I can make predictions for new data:

> test <- data.frame(x = seq(4, 6))
> test$y <- predict(model, newdata = test)
> test
  x        y
1 4 4.333333
2 5 5.333333
3 6 6.333333

This works very nicely, and it's really speedy. I want to add lagged
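A minimal sketch of one way lagged predictors can be added, building the lag column by hand so lm() and predict() treat it as an ordinary variable; the helper name, toy series and one-step lag are all illustrative:

lag1 <- function(v) c(NA, v[-length(v)])              # shift values down one step

set.seed(1)
train <- data.frame(x = 1:8)
train$y <- 2 * train$x - 0.5 * lag1(train$x) + rnorm(8, sd = 0.1)
train$x_lag1 <- lag1(train$x)

model <- lm(y ~ x + x_lag1, data = train)             # first row dropped (NA lag)

test <- data.frame(x = 9:11)
test$x_lag1 <- lag1(c(tail(train$x, 1), test$x))[-1]  # last training x feeds the first lag
predict(model, newdata = test)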

Messy plot when plotting predictions of a polynomial regression using lm() in R

自古美人都是妖i submitted on 2019-11-27 09:52:53
I am building a quadratic model with lm in R:

y <- data[[1]]
x <- data[[2]]
x2 <- x^2
quadratic.model <- lm(y ~ x + x2)

Now I want to display both the predicted values and the actual values on a plot. I tried this:

par(las = 1, bty = "l")
plot(y ~ x)
P <- predict(quadratic.model)
lines(x, P)

but the line comes out all squiggly. Maybe it has to do with the fact that it's quadratic? Thanks for any help.

李哲源: You need order():

P <- predict(quadratic.model)
plot(y ~ x)
reorder <- order(x)
lines(x[reorder], P[reorder])

My answer here is related: Problems displaying LOESS regression line and confidence
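An alternative sketch of the same fix: predict on a sorted grid of new x values, so the line is drawn left to right no matter how the raw data happen to be ordered (the toy data here are assumed):

set.seed(1)
x <- runif(50, -3, 3)                      # unsorted x is what makes lines() zig-zag
y <- 1 + 2 * x + 0.5 * x^2 + rnorm(50)
x2 <- x^2
quadratic.model <- lm(y ~ x + x2)

grid <- data.frame(x = seq(min(x), max(x), length.out = 200))
grid$x2 <- grid$x^2                        # the model needs both x and x2 columns
plot(y ~ x)
lines(grid$x, predict(quadratic.model, newdata = grid))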

Showing the string in the formula rather than the variable name in an lm fit

久未见 submitted on 2019-11-27 09:49:16
I am not able to resolve the issue that when lm(sformula) is executed, it does not show the string that is assigned to sformula. I have a feeling this is the generic way R handles the arguments of a function and is not specific to linear regression. Below is an illustration of the issue. Example 1 has the undesired output lm(formula = sformula); example 2 has the output I would like, i.e. lm(formula = "y~x").

x <- 1:10
y <- x * runif(10)
sformula <- "y~x"

## Example 1
lm(sformula)
## Call:
## lm(formula = sformula)

## Example 2
lm("y~x")
## Call:
## lm(formula = "y~x")

How about
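Not from the excerpt, but one common workaround (a sketch, assuming that seeing the expanded formula in the printed call is all that is needed): evaluate the string into a formula before the call is captured, for example via do.call(), so the stored call records y ~ x rather than the variable name.

x <- 1:10
y <- x * runif(10)
sformula <- "y~x"

# do.call() evaluates its arguments first, so the recorded call contains
# the formula object itself instead of the symbol sformula.
fit <- do.call("lm", list(formula = as.formula(sformula)))
fit$call
## expected: lm(formula = y ~ x)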

R error which says “Models were not all fitted to the same size of dataset”

随声附和 submitted on 2019-11-27 09:16:33
I have created two generalised linear models as follows:

glm1 <- glm(Y ~ X1 + X2 + X3, family = binomial(link = logit))
glm2 <- glm(Y ~ X1 + X2, family = binomial(link = logit))

I then use the anova function:

anova(glm2, glm1)

but get an error message:

"Error in anova.glmlist(c(list(object), dotargs), dispersion = dispersion, : models were not all fitted to the same size of dataset"

What does this mean and how can I fix it? I have attach()ed the dataset at the start of my code, so both models are working off the same dataset. The main cause of that error is when there are missing values in one or more
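A minimal sketch of the usual fix, refitting both models on the same complete-case subset so anova() compares fits based on identical rows (the data frame dat and the column names here are hypothetical):

# Keep only rows that are complete for every variable either model uses.
vars <- c("Y", "X1", "X2", "X3")
dat_cc <- dat[complete.cases(dat[, vars]), ]

glm1 <- glm(Y ~ X1 + X2 + X3, family = binomial(link = logit), data = dat_cc)
glm2 <- glm(Y ~ X1 + X2,      family = binomial(link = logit), data = dat_cc)

anova(glm2, glm1, test = "Chisq")   # both models now see the same number of rows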

Predicting x values from a simple fit and annotating them in the plot

佐手、 submitted on 2019-11-27 08:38:51
Question: I have a very simple question, but so far I couldn't find an easy solution. Let's say I have some data that I want to fit, and I want to show the x-axis value at which y takes a particular value; in this case, the x value where y = 0. The model is a simple y ~ x fit, but I don't know how to estimate the x value from it. Anyway, sample data:

library(ggplot2)
library(scales)
df <- data.frame(x = sort(10^runif(8, -6, 1), decreasing = TRUE),
                 y = seq(-4, 4, length.out = 8))
ggplot(df, aes(x = x, y = y
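Not from the excerpt, but a sketch of one way to invert the fit and get the x at which y = 0. It assumes the model is a straight line in log10(x) (plausible here because x spans several decades, but still an assumption about the intended model):

set.seed(1)
df <- data.frame(x = sort(10^runif(8, -6, 1), decreasing = TRUE),
                 y = seq(-4, 4, length.out = 8))

# y = a + b * log10(x)  =>  x = 10^((y - a) / b)
fit <- lm(y ~ log10(x), data = df)
a <- coef(fit)[1]; b <- coef(fit)[2]
x_at_y0 <- unname(10^((0 - a) / b))
x_at_y0
# could then be marked on the ggplot with geom_vline(xintercept = x_at_y0)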