lm

R: lm() result differs when using the `weights` argument and when using manually reweighted data

家住魔仙堡 submitted on 2019-11-26 20:22:33
Question: In order to correct heteroskedasticity in the error terms, I am running the following weighted least squares regression in R:

#Call:
#lm(formula = a ~ q + q2 + b + c, data = mydata, weights = weighting)
#
#Weighted Residuals:
#     Min       1Q   Median       3Q      Max
#-1.83779 -0.33226  0.02011  0.25135  1.48516
#
#Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
#(Intercept) -3.939440   0.609991  -6.458 1.62e-09 ***
#q            0.175019   0.070101   2.497 0.013696 *
#q2           0.048790   0.005613   8.693 8.49e-15 ***
#b            0.473891   0.134918   3
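
A minimal sketch (made-up data, not the asker's mydata/weighting) of the usual source of this discrepancy: lm()'s weights minimize sum(w * residual^2), so a manual refit reproduces it only if the response and every predictor column, including the intercept column, are multiplied by sqrt(w) rather than by w.

# Illustrative data; w plays the role of `weighting`
set.seed(1)
n <- 50
x <- runif(n)
w <- runif(n, 0.5, 2)
y <- 1 + 2 * x + rnorm(n, sd = 1 / sqrt(w))

fit_w <- lm(y ~ x, weights = w)                 # built-in WLS

# Manual equivalent: scale y and the design matrix by sqrt(w);
# sqrt(w) itself becomes the "intercept" column, so drop the ordinary intercept.
sw <- sqrt(w)
fit_m <- lm(I(sw * y) ~ 0 + sw + I(sw * x))

coef(fit_w)
coef(fit_m)                                     # identical point estimates

The coefficient standard errors agree as well, since both calls describe the same weighted least squares problem.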

Is there a fast estimation of simple regression (a regression line with only intercept and slope)?

我的梦境 submitted on 2019-11-26 18:40:59
Question: This question relates to a machine learning feature selection procedure. I have a large matrix of features, where the columns are the features of the subjects (rows):

set.seed(1)
features.mat <- matrix(rnorm(10*100), ncol = 100)
colnames(features.mat) <- paste("F", 1:100, sep = "")
rownames(features.mat) <- paste("S", 1:10, sep = "")

The response was measured for each subject (S) under different conditions (C) and therefore looks like this:

response.df <- data.frame(S = c(sapply(1:10, function(x) rep(paste(
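
For the titular question, per-feature simple regressions do not need lm() at all: with a single response vector, the intercept and slope have closed forms that vectorize over all columns at once. A sketch under that assumption (the response.df structure above is truncated, so y here is an illustrative stand-in):

set.seed(1)
features.mat <- matrix(rnorm(10 * 100), ncol = 100)
y <- rnorm(10)                            # illustrative response, one value per subject

xbar  <- colMeans(features.mat)
ybar  <- mean(y)
Xc    <- sweep(features.mat, 2, xbar)     # center every feature column
slope <- drop(crossprod(Xc, y - ybar)) / colSums(Xc^2)
inter <- ybar - slope * xbar

# spot check against lm() for the first feature
coef(lm(y ~ features.mat[, 1]))
c(inter[1], slope[1])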

lm function in R does not give coefficients for all factor levels in categorical data

梦想与她 submitted on 2019-11-26 18:29:44
Question: I was trying out linear regression in R with categorical attributes and observed that I don't get a coefficient value for each of my factor levels. Please see my code below: I have 5 factor levels for states, but see only 4 coefficient values.

> states = c("WA","TE","GE","LA","SF")
> population = c(0.5,0.2,0.6,0.7,0.9)
> df = data.frame(states,population)
> df
  states population
1     WA        0.5
2     TE        0.2
3     GE        0.6
4     LA        0.7
5     SF        0.9
> states=NULL
> population=NULL
> lm(formula
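
What is happening, shown with a short sketch on the same toy data: under R's default treatment contrasts, the first factor level (the first level alphabetically) becomes the reference and is absorbed into the intercept, so a k-level factor contributes k - 1 dummy coefficients. Dropping the intercept is one way to see an estimate for every level.

states     <- c("WA", "TE", "GE", "LA", "SF")
population <- c(0.5, 0.2, 0.6, 0.7, 0.9)
df <- data.frame(states, population)

coef(lm(population ~ states, data = df))      # intercept + 4 dummies; the reference level is folded into the intercept
coef(lm(population ~ states + 0, data = df))  # no intercept: one coefficient per level

With only five observations and five levels this toy fit is saturated (one mean per level), but the same pattern holds for real data.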

Finding where two linear fits intersect in R

可紊 submitted on 2019-11-26 18:22:42
Question: I have two linear fits that I've gotten from lm calls in my R script. For instance...

fit1 <- lm(y1 ~ x1)
fit2 <- lm(y2 ~ x2)

I'd like to find the (x, y) point at which these two lines (fit1 and fit2) intersect, if they intersect at all.

Answer 1: One way to avoid the geometry is to re-parameterize the equations as:

y1 = m1 * (x1 - x0) + y0
y2 = m2 * (x2 - x0) + y0

in terms of their intersection point (x0, y0), and then perform the fit of both at once using nls so that the returned values of x0
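
Alternatively to the nls re-parameterization quoted above, the intersection can be read directly from the two sets of fitted coefficients by solving b1 + m1*x = b2 + m2*x. A sketch with made-up data standing in for x1/y1 and x2/y2:

set.seed(1)
x1 <- 1:20; y1 <- 2 + 0.5 * x1 + rnorm(20, sd = 0.2)
x2 <- 1:20; y2 <- 8 - 0.3 * x2 + rnorm(20, sd = 0.2)

fit1 <- lm(y1 ~ x1)
fit2 <- lm(y2 ~ x2)

b1 <- coef(fit1); b2 <- coef(fit2)

# slopes must differ, otherwise the lines are parallel and never intersect
x0 <- (b2[1] - b1[1]) / (b1[2] - b2[2])
y0 <- b1[1] + b1[2] * x0
c(x = unname(x0), y = unname(y0))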

How does `poly()` generate orthogonal polynomials? How to understand the “coefs” returned?

匆匆过客 submitted on 2019-11-26 17:45:40
Question: My understanding of orthogonal polynomials is that they take the form

y(x) = a1 + a2(x - c1) + a3(x - c2)(x - c3) + a4(x - c4)(x - c5)(x - c6) + ...

up to the number of terms desired, where a1, a2, etc. are the coefficients of each orthogonal term (varying between fits), and c1, c2, etc. are the coefficients within the orthogonal terms, determined such that the terms maintain orthogonality (consistent between fits using the same x values). I understand poly() is used to fit orthogonal polynomials. An example x
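
A short sketch of the mechanics (my illustration, not the original answer): poly() builds an orthonormal polynomial basis from the given x values and stores the centering and normalization constants ("alpha" and "norm2") in the "coefs" attribute, which is what predict() passes back to poly() so that new x values are projected onto the same basis.

x <- 1:10
P <- poly(x, degree = 3)

round(crossprod(P), 10)        # identity: the three basis columns are orthonormal
cf <- attr(P, "coefs")         # list with alpha (centering) and norm2 (scaling) constants
str(cf)

# Rebuild the same basis for new x values, as predict() does internally
newx <- c(2.5, 7.5)
poly(newx, degree = 3, coefs = cf)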

`lm` summary does not display all factor levels

不问归期 submitted on 2019-11-26 17:44:06
I am running a linear regression on a number of attributes, including two categorical attributes, B and F, and I don't get a coefficient value for every factor level I have. B has 9 levels and F has 6 levels. When I initially ran the model (with an intercept), I got 8 coefficients for B and 5 for F, which I understood as the first level of each being included in the intercept. I want to rank the levels within B and F based on their coefficients, so I added -1 after each factor to lock the intercept at 0 so that I could get coefficients for all levels.

Call:
lm(formula = dependent ~ a + B-1 + c + d
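
The behaviour behind the title can be reproduced with a small sketch (made-up factors standing in for B and F): removing the intercept only releases all levels of the first factor in the formula; the second factor still loses one level, because otherwise its dummy columns would be linearly dependent on the first factor's.

set.seed(1)
B <- factor(sample(paste0("b", 1:3), 100, replace = TRUE))
F <- factor(sample(paste0("f", 1:3), 100, replace = TRUE))   # shadows base F; harmless in this sketch
y <- rnorm(100)

length(coef(lm(y ~ B + F)))        # 5: intercept + 2 dummies for B + 2 for F
length(coef(lm(y ~ B + F - 1)))    # 6: all 3 levels of B, but still only 2 dummies for F

The remaining F coefficients are still offsets from a baseline level, so they are not on the same footing as the per-level estimates for B.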

Aligning a data frame with missing values

陌路散爱 submitted on 2019-11-26 17:21:02
Question: I'm using a data frame with many NA values. While I'm able to create a linear model, I am subsequently unable to line the fitted values of the model up with the original data due to the missing values and the lack of an indicator column. Here's a reproducible example:

library(MASS)
dat <- Aids2

# Add NA's
dat[floor(runif(100, min = 1, max = nrow(dat))), 3] <- NA

# Create a model
model <- lm(death ~ diag + age, data = dat)

# Different Values
length(fitted.values(model)) # 2745
nrow(dat) # 2843

Answer 1:
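
The answer itself is cut off above, but a minimal sketch of the usual fix (assuming the goal is simply fitted values that align row-for-row with dat): refit with na.action = na.exclude, which pads fitted values and residuals with NA at the dropped rows.

library(MASS)

dat <- Aids2
set.seed(1)                                   # added only to make the NA positions reproducible
dat[floor(runif(100, min = 1, max = nrow(dat))), 3] <- NA

model <- lm(death ~ diag + age, data = dat, na.action = na.exclude)

length(fitted(model)) == nrow(dat)            # TRUE: NA placeholders keep the rows aligned
dat$fit <- fitted(model)                      # can now be attached to the original data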

Fast pairwise simple linear regression between variables in a data frame

▼魔方 西西 submitted on 2019-11-26 16:41:10
I have seen pairwise or general paired simple linear regression many times on Stack Overflow. Here is a toy dataset for this kind of problem.

set.seed(0)
X <- matrix(runif(100), 100, 5, dimnames = list(1:100, LETTERS[1:5]))
b <- c(1, 0.7, 1.3, 2.9, -2)
dat <- X * b[col(X)] + matrix(rnorm(100 * 5, 0, 0.1), 100, 5)
dat <- as.data.frame(dat)
pairs(dat)

So basically we want to compute 5 * 4 = 20 regression lines:

A ~ B   A ~ C   A ~ D   A ~ E   B ~ A
B ~ C   B ~ D   B ~ E   C ~ A   C ~ B
C ~ D   C ~ E   D ~ A   D ~ B   D ~ C
D ~ E   E ~ A   E ~ B   E ~ C   E ~ D

Here is a poor man's strategy:

poor <
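
A hedged sketch of one fast alternative (point estimates only, not the full lm machinery): for simple regressions every slope is a covariance divided by a variance, so the whole grid of 20 fits falls out of cov(dat) and colMeans(dat) in one shot.

set.seed(0)
X <- matrix(runif(100), 100, 5, dimnames = list(1:100, LETTERS[1:5]))
b <- c(1, 0.7, 1.3, 2.9, -2)
dat <- as.data.frame(X * b[col(X)] + matrix(rnorm(100 * 5, 0, 0.1), 100, 5))

V <- cov(dat)
m <- colMeans(dat)

slope     <- V / diag(V)[col(V)]                       # slope[i, j] is the slope of  i ~ j
M_y       <- matrix(m, 5, 5, dimnames = dimnames(V))   # response means: element [i, j] = mean of variable i
M_x       <- matrix(m, 5, 5, byrow = TRUE)             # predictor means: element [i, j] = mean of variable j
intercept <- M_y - slope * M_x

# spot check against lm() for A ~ B (the diagonal entries are the degenerate i ~ i case)
coef(lm(A ~ B, data = dat))
c(intercept["A", "B"], slope["A", "B"])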

Pass a vector of variables into lm() formula

﹥>﹥吖頭↗ submitted on 2019-11-26 16:05:52
I was trying to automate a piece of my code so that programming becomes less tedious. Basically I was trying to do a stepwise selection of variables using fastbw() in the rms package. I would like to pass the list of variables selected by fastbw() into a formula as y ~ x1 + x2 + x3, with "x1" "x2" "x3" being the list of variables selected by fastbw(). Here is the code I tried, which did not work:

olsOAW0.r060 <- ols(roll_pct ~ byoy + trans_YoY + change18m, subset = helper == "POPNOAW0_r060",
                    na.action = na.exclude, data = modelready)
OAW0 <- fastbw(olsOAW0.r060, rule = "p", type = "residual", sls = 0.05)
vec <- as.vector
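
A sketch of the standard pattern for building a formula from a character vector of names (the vector below is illustrative, standing in for whatever fastbw() selected):

y_name  <- "roll_pct"
x_names <- c("byoy", "trans_YoY", "change18m")   # hypothetical selected predictors

f1 <- reformulate(x_names, response = y_name)
f2 <- as.formula(paste(y_name, "~", paste(x_names, collapse = " + ")))   # equivalent

f1   # roll_pct ~ byoy + trans_YoY + change18m

Either formula object can then be passed to lm() or ols() directly.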

Linear Regression with a known fixed intercept in R

北战南征 submitted on 2019-11-26 15:20:37
Question: I want to calculate a linear regression using the lm() function in R. Additionally I want to get the slope of a regression where I explicitly give the intercept to lm(). I found an example on the internet and I tried to read the R help "?lm" (unfortunately I'm not able to understand it), but I did not succeed. Can anyone tell me where my mistake is?

lin <- data.frame(x = c(0:6), y = c(0.3, 0.1, 0.9, 3.1, 5, 4.9, 6.2))
plot(lin$x, lin$y)
regImp = lm(formula = lin$x ~ lin$y)
abline(regImp,
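
A minimal sketch of the usual approach for a known, fixed intercept k (k = 0.3 below is just an illustrative value, and y is regressed on x rather than the other way around): supply k as an offset and drop the free intercept, so lm() estimates only the slope.

lin <- data.frame(x = 0:6, y = c(0.3, 0.1, 0.9, 3.1, 5, 4.9, 6.2))
k <- 0.3                                         # hypothetical known intercept

fit <- lm(y ~ 0 + x, data = lin, offset = rep(k, nrow(lin)))
coef(fit)                                        # slope only; the intercept stays fixed at k

plot(lin$x, lin$y)
abline(a = k, b = coef(fit))

An equivalent form is lm(I(y - k) ~ 0 + x, data = lin), which subtracts the known intercept from the response before fitting.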