lm

decreasing coefficients in R's coefplot?

和自甴很熟 · Submitted on 2019-12-11 06:39:14
Question: coefplot from library(coefplot) has an argument, decreasing, which when set to TRUE should plot the coefficients in descending order. But when I run a toy example:

data(tips, package = "reshape2")
mod1 <- lm(tip ~ day + sex + smoker, data = tips)
coefplot(mod1, decreasing = TRUE)

the coefficients aren't in descending order. What am I missing?

EDIT: I was missing sort = "magnitude". However, this doesn't work with multiplot:

data(tips, package = "reshape2")
mod1 <- lm(tip ~ day + sex …
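Based on the asker's own EDIT, a minimal sketch of the fix, reusing the toy model above (coefplot does take sort and decreasing arguments, but this is an illustration, not the package's canonical example):

library(coefplot)
data(tips, package = "reshape2")
mod1 <- lm(tip ~ day + sex + smoker, data = tips)
# sort = "magnitude" orders the coefficients by size; decreasing = TRUE reverses that order
coefplot(mod1, sort = "magnitude", decreasing = TRUE)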

Checking Type III ANOVA results [duplicate]

自闭症网瘾萝莉.ら · Submitted on 2019-12-11 06:29:05
Question: This question already has answers here: "How to use formula in R to exclude main effect but retain interaction" (3 answers); "Why do I get NA coefficients and how does `lm` drop reference level for interaction" (1 answer). Closed 2 years ago. Setting aside the debate about Type III ANOVA and the Principle of Marginality and all that... I've set up two models whose sums of squares should be different (and a Type III ANOVA would test that difference). Here's the code:

library(car)
library(openintro)
…
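The code is cut off, but for context, the usual recipe for Type III sums of squares with car requires sum-to-zero contrasts; a minimal sketch with a placeholder model, since the original one is not shown:

library(car)
# Type III tests are only meaningful with sum-to-zero contrasts
options(contrasts = c("contr.sum", "contr.poly"))
mod <- lm(mpg ~ factor(cyl) * factor(am), data = mtcars)  # placeholder model
Anova(mod, type = "III")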

Prediction using Fixed Effects

时间秒杀一切 · Submitted on 2019-12-11 05:56:50
Question: I have a simple data set to which I applied a simple linear regression model. Now I would like to use fixed effects to get a better prediction from the model. I know that I could also consider making dummy variables, but my real dataset consists of more years and has more variables, so I would like to avoid making dummies. My data and code are similar to this:

data <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 …
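Since the goal is fixed effects without hand-built dummy columns, a minimal sketch of two common routes, assuming the column names from the fragment above: factor() makes lm generate the dummies internally, and the plm package fits a within (fixed-effects) model directly.

# route 1: let lm build the dummy variables via factor()
fe_lm <- lm(ResponseVariable ~ ExplanatoryVariable1 +
              factor(Year) + factor(CompanyNumber), data = data)

# route 2: a within (fixed-effects) panel model with plm
library(plm)
fe_plm <- plm(ResponseVariable ~ ExplanatoryVariable1, data = data,
              index = c("CompanyNumber", "Year"), model = "within")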

Regarding handling many binary independent variables in lm

孤街浪徒 · Submitted on 2019-12-11 05:23:43
Question: When building a linear regression model using lm, the data set has about 20 independent variables. Do I need to explicitly declare them as factors? If I have to, how can I do that? It can be very tedious to declare them one by one.

Answer 1: First, check which variables R has automatically converted into factors with the command str(mydata). Then, if you want to convert several variables into factors easily, you can do something like this: create a "mycol" variable with the numbers of the columns you want to …
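The answer breaks off mid-sentence; a minimal sketch of where it was heading (the column numbers are hypothetical):

str(mydata)                         # see which columns are already factors
mycol <- c(2, 5:10)                 # hypothetical: the columns to convert
mydata[mycol] <- lapply(mydata[mycol], factor)
str(mydata)                         # verify the conversion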

R: How to or should I drop an insignificant orthogonal polynomial basis in a linear model?

こ雲淡風輕ζ · Submitted on 2019-12-11 05:07:43
Question: I have soil moisture data with x-, y- and z-coordinates like this:

gue <- structure(list(x = c(311939.1507, 311935.4607, 311924.7316, 311959.553, 311973.5368, 311953.3743, 311957.9409, 311948.3151, 311946.7169, 311997.0803, 312017.5236, 312006.0245, 312001.5179, 311992.7044, 311977.3076, 311960.4159, 311970.6047, 311957.2564, 311866.4246, 311870.8714, 311861.4461, 311928.7096, 311929.6291, 311929.4233, 311891.2915, 311890.3429, 311900.8905, 311864.4995, 311870.8143, 311866.9257, 312002.571, …
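The data dump is truncated, but the standard way to decide whether a higher-order orthogonal polynomial term can be dropped is a nested-model F-test; a minimal sketch with a hypothetical response name (moisture), since the full structure() call is cut off:

fit3 <- lm(moisture ~ x + y + poly(z, 3), data = gue)  # hypothetical full model
fit2 <- lm(moisture ~ x + y + poly(z, 2), data = gue)  # cubic term dropped
anova(fit2, fit3)  # F-test: does the cubic term improve the fit?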

Get p values for a specific variable in many models with all possible combinations of other independent variables

两盒软妹~` · Submitted on 2019-12-11 03:28:22
Question: I am trying to run many regression models with all possible combinations of a set of independent variables. In this example, I am interested in the coefficients of cyl with all possible combinations of the other variables listed in xlist.

df <- mtcars
md <- "mpg ~ cyl"
xlist <- c("disp", "hp", "am")
n <- length(xlist)
# get a list of all possible combinations of xlist
comb_lst <- unlist(lapply(1:n, function(x) combn(xlist, x, simplify = F)), recursive = F)
# get a list of all models
md_lst <- …
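A minimal sketch that completes the snippet: paste each combination onto the base formula, fit every model, and pull the p-value for cyl from each coefficient table.

df <- mtcars
md <- "mpg ~ cyl"
xlist <- c("disp", "hp", "am")
n <- length(xlist)
comb_lst <- unlist(lapply(1:n, function(x) combn(xlist, x, simplify = FALSE)),
                   recursive = FALSE)
# one formula per combination, e.g. "mpg ~ cyl + disp + hp"
md_lst <- lapply(comb_lst, function(v) paste(md, paste(v, collapse = " + "), sep = " + "))
fits   <- lapply(md_lst, function(f) lm(as.formula(f), data = df))
p_cyl  <- sapply(fits, function(m) summary(m)$coefficients["cyl", "Pr(>|t|)"])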

rstudent() returns incorrect result for an “mlm” (linear models fitted with multiple LHS)

巧了我就是萌 · Submitted on 2019-12-11 03:19:30
Question: I know that support for linear models with multiple LHS is limited. But when it is possible to run a function on an "mlm" object, I would expect the results to be trustworthy. When using rstudent, strange results are produced. Is this a bug, or is there some other explanation? In the example below, fittedA and fittedB are identical, but in the case of rstudent the 2nd column differs.

y <- matrix(rnorm(20), 10, 2)
x <- 1:10
fittedA <- fitted(lm(y ~ x))
fittedB <- cbind(fitted(lm(y[, 1] ~ x)), …
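A minimal sketch of the comparison the question describes, so the discrepancy can be reproduced (set.seed added for reproducibility):

set.seed(1)
y <- matrix(rnorm(20), 10, 2)
x <- 1:10
rstudentA <- rstudent(lm(y ~ x))                  # on the "mlm" fit
rstudentB <- cbind(rstudent(lm(y[, 1] ~ x)),      # column-by-column fits
                   rstudent(lm(y[, 2] ~ x)))
all.equal(unname(rstudentA), unname(rstudentB))   # the 2nd column reportedly differs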

Ordinary least squares with glmnet and lm

£可爱£侵袭症+ · Submitted on 2019-12-11 02:38:10
Question: This question was asked in stackoverflow.com/q/38378118 but there was no satisfactory answer. LASSO with λ = 0 is equivalent to ordinary least squares, but this does not seem to be the case for glmnet() and lm() in R. Why?

library(glmnet)
options(scipen = 999)
X = model.matrix(mpg ~ 0 + ., data = mtcars)
y = as.matrix(mtcars["mpg"])
coef(glmnet(X, y, lambda = 0))
lm(y ~ X)

Their regression coefficients agree to at most 2 significant figures, perhaps due to slightly different termination …
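A minimal sketch of one commonly suggested check: tighten glmnet's convergence threshold (thresh, default 1e-7) so the coordinate-descent solution gets closer to the exact least-squares solution; exact agreement is still not guaranteed.

library(glmnet)
X <- model.matrix(mpg ~ 0 + ., data = mtcars)
y <- as.matrix(mtcars["mpg"])
fit0 <- glmnet(X, y, lambda = 0, thresh = 1e-14)  # much stricter than the default
round(cbind(glmnet = as.vector(coef(fit0)),
            lm     = as.vector(coef(lm(y ~ X)))), 6)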

biglm predict unable to allocate a vector of size xx.x MB

断了今生、忘了曾经 · Submitted on 2019-12-11 02:36:53
Question: I have this code:

library(biglm)
library(ff)
myData <- read.csv.ffdf(file = "myFile.csv")
testData <- read.csv(file = "test.csv")
form <- dependent ~ .
model <- biglm(form, data = myData)
predictedData <- predict(model, newdata = testData)

The model is created without problems, but when I make the prediction it runs out of memory: "unable to allocate a vector of size xx.x MB". Any hints? Or how can I use ff to reserve memory for the predictedData variable?

Answer 1: I have not used the biglm package before. …
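One hedged workaround, assuming predict.biglm scores rows independently: predict in chunks and stream the results to disk, so the full prediction vector never has to fit in memory at once.

chunk_size <- 10000
con <- file("predictions.csv", open = "w")
for (i in split(seq_len(nrow(testData)),
                ceiling(seq_len(nrow(testData)) / chunk_size))) {
  p <- predict(model, newdata = testData[i, , drop = FALSE])
  write.table(p, con, row.names = FALSE, col.names = FALSE)
}
close(con)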

How can I extract the number of lines and the corresponding equations from a linear fit

五迷三道 · Submitted on 2019-12-10 23:56:58
Question: I have data and I expect several linear correlations of the form

y_i = a_i + b_i * t_i,  i = 1 .. N

where N is a priori unknown. The short version of the question is: given a fit, how can I extract N? How can I extract the equations? In the reproducible example below, I have data (t, y) with corresponding parameters p1 (levels p1_1, p1_2) and p2 (levels p2_1, p2_2, p2_3). Thus the data look like (t, y, p1, p2), which has at most 2*3 = 6 different best-fit lines, and the linear fit then has …
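The example is cut off, but given the (t, y, p1, p2) layout described, a minimal sketch of one way to recover both N and the per-line equations is to fit lm separately within each observed (p1, p2) group (the data frame name d is hypothetical):

grp  <- interaction(d$p1, d$p2, drop = TRUE)  # drop = TRUE keeps only observed combinations
fits <- lapply(split(d, grp), function(s) lm(y ~ t, data = s))
length(fits)                                  # N: the number of distinct fitted lines
t(sapply(fits, coef))                         # one row per line: intercept a_i and slope b_i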