lm | 易学教程

Fitting a function in R

Question: I have a few data points (x and y) that seem to have a logarithmic relationship.

> mydata
    x   y
1   0 123
2   2 116
3   4 113
4  15 100
5  48  87
6  75  84
7 122  77
> qplot(x, y, data=mydata, geom="line")

Now I would like to find an underlying function that fits the graph and lets me infer other data points (e.g. at x = 3 or x = 82). I read about lm and nls but I'm not really getting anywhere. At first, I created a function that I thought resembled the plot the most: f <- function(x, a, b) { a * exp(b *-x
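As a minimal sketch of one way to approach this, the data from the question can be fitted with lm on a log-transformed predictor and then passed to predict. The model form y ~ log1p(x) is an assumption chosen here because x contains 0 (so log(x) is undefined there); it is not the function the asker was writing, and other forms (for example an nls fit) may suit the data better.

```r
# Data copied from the question
mydata <- data.frame(
  x = c(0, 2, 4, 15, 48, 75, 122),
  y = c(123, 116, 113, 100, 87, 84, 77)
)

# Assumed model: y is linear in log(1 + x); log1p avoids log(0) at x = 0
fit <- lm(y ~ log1p(x), data = mydata)

# Infer y at new x values (3 and 82 are the points mentioned in the question)
new_points <- data.frame(x = c(3, 82))
pred <- predict(fit, newdata = new_points)

print(coef(fit))
print(pred)
```

Because the transformation is written inside the formula, predict applies it to newdata automatically, so the new x values are given on the original scale.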

Fast group-by simple linear regression

This Q & A arises from "How to make group_by and lm fast?", where the OP was trying to run a simple linear regression per group on a large data frame. In theory, a series of group-by regressions y ~ x | g is equivalent to a single pooled regression y ~ x * g. The latter is very appealing because statistical tests between different groups are straightforward. In practice, however, fitting this larger regression is not computationally easy. My answer on the linked Q & A reviews the packages speedlm and glm4, but points out that they cannot address this problem well. Large regression problem is difficult,
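The equivalence stated above can be checked on a small example. The sketch below uses made-up data (the names dat, x, y, g and all numbers are illustrative assumptions, not taken from the linked Q & A). It fits the pooled model in the parametrisation y ~ 0 + g + g:x, which describes the same model as y ~ x * g but returns one intercept and one slope per group directly.

```r
set.seed(1)

# Hypothetical toy data: three groups with different intercepts and slopes
g <- factor(rep(c("a", "b", "c"), each = 20))
x <- rnorm(60)
y <- c(1, 2, 3)[as.integer(g)] + c(0.5, -1, 2)[as.integer(g)] * x +
  rnorm(60, sd = 0.1)
dat <- data.frame(x = x, y = y, g = g)

# One simple regression per group: a 2 x 3 matrix, rows = (intercept, slope)
by_group <- sapply(split(dat, dat$g), function(d) coef(lm(y ~ x, data = d)))

# Single pooled regression with group-specific intercepts and slopes
pooled <- lm(y ~ 0 + g + g:x, data = dat)

# Coefficients come out as (ga, gb, gc, ga:x, gb:x, gc:x); reshape to match
pooled_coef <- matrix(coef(pooled), nrow = 2, byrow = TRUE,
                      dimnames = dimnames(by_group))

print(all.equal(by_group, pooled_coef))

# The y ~ x * g form gives the same fit, only parametrised as differences
# from the first group
interaction_fit <- lm(y ~ x * g, data = dat)
print(all.equal(fitted(pooled), fitted(interaction_fit)))
```

The coefficients agree, but the residual variance does not: the pooled fit estimates a single error variance across all groups, whereas separate fits estimate one per group, so standard errors generally differ.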

Cluster-Robust Standard Errors in Stargazer

Does anyone know how to get stargazer to display clustered SEs for lm models? (And the corresponding F-test?) If possible, I'd like to follow an approach similar to computing heteroskedasticity-robust SEs with sandwich and popping them into stargazer as in http://jakeruss.com/cheatsheets/stargazer.html#robust-standard-errors-replicating-statas-robust-option . I'm using lm to get my regression models, and I'm clustering by firm (a factor variable that I'm not including in the regression models). I also have a bunch of NA values, which makes me think multiwayvcov is going to be the best package

Piecewise regression with a straight line and a horizontal line joining at a break point

I want to do a piecewise linear regression with one break point, where the 2nd half of the regression line has slope = 0. There are examples of how to do a piecewise linear regression, such as here. The problem I'm having is that I'm not clear how to fix the slope of half of the model to be 0. I tried

lhs <- function(x) ifelse(x < k, k-x, 0)
rhs <- function(x) ifelse(x < k, 0, x-k)
fit <- lm(y ~ lhs(x) + rhs(x))

where k is the break point, but the segment on the right is not a flat / horizontal one. I want to constrain the slope of the second segment at 0. I tried: fit <- lm(y ~ x * (x < k) + x *
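One way to get a flat right-hand segment, sketched here under the assumption that the break point k is known in advance, is to drop the rhs(x) term entirely. With only lhs(x) in the model, the fitted value for x >= k is the intercept alone, so the second segment is horizontal by construction. The data below are simulated for illustration and are not from the question.

```r
set.seed(42)

k <- 5                                  # assumed known break point (hypothetical value)
x <- seq(0, 10, by = 0.25)
# Simulated data: slope 1.5 up to k, then a plateau at 2 + 1.5 * k
y <- 2 + 1.5 * pmin(x, k) + rnorm(length(x), sd = 0.2)

# Same helper as in the question: k - x left of the break, 0 to the right
lhs <- function(x) ifelse(x < k, k - x, 0)

# Only the left-hand term is fitted, so the slope right of k is fixed at 0
fit <- lm(y ~ lhs(x))

# The intercept estimates the plateau height; the lhs coefficient is minus
# the slope of the left segment
print(coef(fit))

# Predictions to the right of the break are all equal to the intercept
right <- predict(fit, newdata = data.frame(x = c(6, 8, 10)))
print(right)
```

If k is unknown it has to be estimated as well, which makes the problem nonlinear and is outside what this sketch covers.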

How to add all variables its second degree in lm()? [duplicate]

Question: This question already has an answer here: "R:fit dynamic number of explanatory variable into polynomial regression" (1 answer). Closed 3 years ago. I have a data frame with 16 variables. When I do multiple linear regression I do the following: fit <- lm(y ~ .,data=data) Now, I know how to add a second-degree term for one of the variables: fit2 <- lm(y ~ poly(x1,2) + .,data=data) But I don't want to write this out for all 16 of my variables. How can I do this in an easy way for all my
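A possible approach, sketched here rather than quoted from the linked answer, is to build the formula programmatically: wrap every predictor name in poly(…, 2) and assemble the terms with reformulate. The built-in mtcars data set stands in for the asker's data frame, and the choice of mpg as the response is an assumption for illustration.

```r
# Stand-in data: a response (mpg) and three numeric predictors
dat <- mtcars[, c("mpg", "wt", "hp", "disp")]

response <- "mpg"
predictors <- setdiff(names(dat), response)

# One "poly(<name>, 2)" term per predictor
poly_terms <- sprintf("poly(%s, 2)", predictors)

# reformulate joins the terms with "+" and attaches the response
f <- reformulate(poly_terms, response = response)
print(f)

fit <- lm(f, data = dat)
print(coef(fit))
```

poly uses orthogonal polynomials by default; adding raw = TRUE inside the sprintf template would give plain x and x^2 terms instead.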

warning in lm prediction for r [duplicate]

Question: This question already has answers here: "Getting Warning: 'newdata' had 1 row but variables found have 32 rows on predict.lm" (4 answers). Closed 3 years ago.

collection <- data.frame(col1=X1,col2=X2,col3=X3,col4=X4)
k <- 5
ind <- sample(seq(1,k), length(X1), replace=TRUE)
test_ind = which(ind==1)
train<-collection[-test_ind,]
fit<-lm(X1~poly(X2,2,raw=T)+X3+X4+X2:X3,data=train)
model1_resid<-predict(fit,collection[test_ind,2:4])

Warning message: 'newdata' had 105 rows but variables found
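A likely cause, assuming X1 to X4 are vectors in the workspace as the data.frame call suggests: the formula names X1 to X4, but the columns of train are called col1 to col4. lm and predict therefore fall back to the workspace vectors and ignore newdata. The sketch below uses simulated vectors (an assumption; the original data are not shown) and refers to the data frame columns in the formula instead.

```r
set.seed(1)

# Hypothetical stand-ins for the asker's vectors
n <- 50
X1 <- rnorm(n); X2 <- rnorm(n); X3 <- rnorm(n); X4 <- rnorm(n)

collection <- data.frame(col1 = X1, col2 = X2, col3 = X3, col4 = X4)

k <- 5
ind <- sample(seq_len(k), nrow(collection), replace = TRUE)
test_ind <- which(ind == 1)
train <- collection[-test_ind, ]

# The formula uses the column names of the data frame, so both lm and
# predict look the variables up in `data` / `newdata`
fit <- lm(col1 ~ poly(col2, 2, raw = TRUE) + col3 + col4 + col2:col3,
          data = train)

# newdata only needs the predictor columns, under the same names
pred <- predict(fit, newdata = collection[test_ind, c("col2", "col3", "col4")])

print(length(pred) == length(test_ind))
```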

Solving normal equation gives different coefficients from using `lm`?

Question: I wanted to compute a simple regression using lm and plain matrix algebra. However, the regression coefficients obtained from matrix algebra are only half of those obtained from lm, and I have no clue why. Here's the code: boot_example <- data.frame( x1 = c(1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L), x2 = c(0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L), x3 = c(1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L), x4 = c(0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L), x5 = c(1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L), x6 = c(0L, 1L,

Exporting R lm regression with cbind to text

Question: I have the following lm with a vector of dependent variables: > fit<-lm(cbind(X1m, X3m, X6m, X1y, X2y, X3y, X5y, X6y, X10y, X20y, X30y) ~ (ff + dc), data = yields) When attempting to export the entire output to csv, I get this error: write.csv(as.data.frame(summary(fit)), file="regression1.csv") Error in as.data.frame.default(summary(fit)) : cannot coerce class "listof" to a data.frame If I export just the coefficients, everything works fine: write.csv(as.data.frame(coef(fit)), file=

R: Multiple Linear Regression with a specific range of variables [duplicate]

Question: This question already has answers here: "short formula call for many variables when building a model [duplicate]" (2 answers). Closed 3 years ago. It appears simple, but I don't know how to code it in R. I have a data frame (df) with ~100 variables, and I would like to run a multiple regression with my first variable (Y) as the response and variables 25 to 60 as regressors. The problem is that I don't want to write out each variable name like: lm(Y~var25+var26+.......var60, data=df) I
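Two common ways to avoid typing the names, shown as a sketch on simulated data (the data frame below and its column names are assumptions standing in for the asker's df): subset the data frame to the wanted columns and use the dot shorthand, or build the formula from the column names with reformulate.

```r
set.seed(1)

# Hypothetical stand-in: 200 rows, response Y in column 1, then var2..var100
df <- as.data.frame(matrix(rnorm(200 * 100), nrow = 200))
names(df) <- c("Y", paste0("var", 2:100))

# Option 1: keep only Y and columns 25 to 60, then regress Y on everything else
fit_dot <- lm(Y ~ ., data = df[, c(1, 25:60)])

# Option 2: build the formula from the column names and keep the full data frame
f <- reformulate(names(df)[25:60], response = "Y")
fit_formula <- lm(f, data = df)

# Both fits use the same 36 regressors plus an intercept
print(length(coef(fit_dot)))
print(all.equal(coef(fit_dot), coef(fit_formula)))
```

Option 2 keeps the variable names visible in the model call, which can make later predict calls on the full data frame easier to follow.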

Add regression line (and goodness-of-fit stats) to scatterplot

Question: After reviewing other Stack Overflow posts, I am attempting to add a regression line to my scatter plot with:

plot(subdata2$PeakToGone, subdata2$NO3_AVG, xlim = c(0, 70))
abline(lm(PeakToGone~NO3_AVG, data = subdata2))

However, it is not showing the line. I would also like to add the R^2, RMSE, and p-value from lm as text on the plot. How can I add the regression line to the plot, along with these goodness-of-fit stats?

Answer 1: By default, plot regards the 1st param as x and the 2nd as y. Try
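The point about argument order can be illustrated with a short sketch. This is not the code from the truncated answer: the data are simulated under the column names used in the question, and placing the statistics with legend is one choice among several (text or mtext would also work). The formula is written as NO3_AVG ~ PeakToGone so that the model's response matches the y axis of the plot.

```r
set.seed(1)

# Hypothetical stand-in for the asker's data
subdata2 <- data.frame(PeakToGone = runif(30, 0, 70))
subdata2$NO3_AVG <- 5 - 0.05 * subdata2$PeakToGone + rnorm(30, sd = 0.5)

# Response on the left of ~ is the y axis, predictor on the right is the x axis
fit <- lm(NO3_AVG ~ PeakToGone, data = subdata2)
s <- summary(fit)

r2 <- s$r.squared
rmse <- sqrt(mean(residuals(fit)^2))
pval <- s$coefficients["PeakToGone", "Pr(>|t|)"]

plot(subdata2$PeakToGone, subdata2$NO3_AVG, xlim = c(0, 70))
abline(fit)

# Goodness-of-fit statistics as text in the plot corner
legend("topright", bty = "n",
       legend = c(sprintf("R^2 = %.3f", r2),
                  sprintf("RMSE = %.3f", rmse),
                  sprintf("p = %.3g", pval)))
```

With the formula reversed, as in the question, abline draws a line for the wrong axes, which usually falls outside the plotted region.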