lm

R error which says “Models were not all fitted to the same size of dataset”

无人久伴 submitted on 2019-11-26 14:39:40
Question: I have created two generalised linear models as follows: glm1 <- glm(Y ~ X1 + X2 + X3, family=binomial(link=logit)) glm2 <- glm(Y ~ X1 + X2, family=binomial(link=logit)) I then use the anova function: anova(glm2, glm1) but get an error message: "Error in anova.glmlist(c(list(object), dotargs), dispersion = dispersion, : models were not all fitted to the same size of dataset" What does this mean and how can I fix it? I have attached the dataset at the start of my code so both models are working
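A common cause of this error is missing values: if X3 contains NAs, glm1 silently drops those rows while glm2 keeps them, so the two fits use different numbers of observations and anova() refuses to compare them. A minimal sketch of a fix along those lines, assuming the attached data frame is called mydata (a placeholder name, not from the question):

# mydata is a placeholder for the dataset attached in the question
complete <- complete.cases(mydata[, c("Y", "X1", "X2", "X3")])
glm1 <- glm(Y ~ X1 + X2 + X3, family = binomial(link = logit), data = mydata[complete, ])
glm2 <- glm(Y ~ X1 + X2,      family = binomial(link = logit), data = mydata[complete, ])
anova(glm2, glm1, test = "Chisq")  # both models now see exactly the same rows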

Why is the built-in lm function so slow in R?

…衆ロ難τιáo~ submitted on 2019-11-26 14:22:39
Question: I always thought that the lm function was extremely fast in R, but as this example suggests, the closed-form solution computed using the solve function is much faster. data <- data.frame(y=rnorm(1000), x1=rnorm(1000), x2=rnorm(1000)) X = cbind(1, data$x1, data$x2) library(microbenchmark) microbenchmark( solve(t(X) %*% X, t(X) %*% data$y), lm(y ~ ., data=data)) Can someone explain whether this toy example is a bad example, or whether it is the case that lm is actually slow? EDIT: As suggested by Dirk
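Much of the gap comes from lm() doing far more than solving the normal equations: it parses the formula, builds a model frame and design matrix, runs a QR decomposition and assembles a rich result object. A rough timing sketch (not part of the original post; exact timings are machine-dependent) that exposes the lower-level fitters lm() wraps:

set.seed(1)
data <- data.frame(y = rnorm(1000), x1 = rnorm(1000), x2 = rnorm(1000))
X <- cbind(1, data$x1, data$x2)
library(microbenchmark)
microbenchmark(
  solve(crossprod(X), crossprod(X, data$y)),  # normal equations, as in the question
  .lm.fit(X, data$y),                         # bare QR fit, no formula handling
  lm.fit(X, data$y),                          # QR fit with a named result
  lm(y ~ ., data = data)                      # full formula interface
)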

How does predict.lm() compute confidence interval and prediction interval?

微笑、不失礼 submitted on 2019-11-26 11:22:40
Question: I ran a regression: CopierDataRegression <- lm(V1~V2, data=CopierData1) and my task was to obtain a 90% confidence interval for the mean response given V2=6, and a 90% prediction interval when V2=6. I used the following code: X6 <- data.frame(V2=6) predict(CopierDataRegression, X6, se.fit=TRUE, interval="confidence", level=0.90) predict(CopierDataRegression, X6, se.fit=TRUE, interval="prediction", level=0.90) and I got (87.3, 91.9) and (74.5, 104.8), which seems correct since the PI should be wider. The output for both also included se.fit = 1.39, which was the same. I don't understand what
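Both intervals can be reproduced by hand from the pieces predict() returns, which also shows why the two calls report the same se.fit: the prediction interval simply adds the residual variance under the square root. A sketch assuming the fitted model and X6 from the question:

p <- predict(CopierDataRegression, X6, se.fit = TRUE)
tcrit <- qt(0.95, df = p$df)                                       # two-sided 90% interval
p$fit + c(-1, 1) * tcrit * p$se.fit                                # confidence interval for the mean response
p$fit + c(-1, 1) * tcrit * sqrt(p$se.fit^2 + p$residual.scale^2)   # prediction interval for a new observation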

How do I extract just the number from a named number (without the name)?

北城以北 submitted on 2019-11-26 10:34:04
Question: I am looking for just the value of the B1 (newx) linear model coefficient, not the name. I just want the 0.5 value. I do not want the name "newx". newx <- c(0.5,1.5,2.5) newy <- c(2,3,4) out <- lm(newy ~ newx) out looks like: Call: lm(formula = newy ~ newx) Coefficients: (Intercept) newx 1.5 1.0 I arrived here, but now I am stuck. out$coefficients["newx"] newx 1.0 Answer 1: For a single element like this, use [[ rather than [. Compare: coefficients(out)["newx"] # newx # 1 coefficients(out)[[
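A short sketch of the usual ways to drop the name, using the data from the question:

newx <- c(0.5, 1.5, 2.5)
newy <- c(2, 3, 4)
out <- lm(newy ~ newx)
coef(out)[["newx"]]            # [[ extracts the bare number, dropping the name
unname(coef(out)["newx"])      # or strip the name explicitly
as.numeric(coef(out)["newx"])  # coercion to numeric also drops the name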

Plot polynomial regression curve in R

孤人 submitted on 2019-11-26 08:13:41
Question: I have a simple polynomial regression which I fit as follows: attach(mtcars) fit <- lm(mpg ~ hp + I(hp^2)) Now I plot as follows: > plot(mpg~hp) > points(hp, fitted(fit), col='red', pch=20) This gives me the following. I want to connect these points into a smooth curve; using lines gives me the following: > lines(hp, fitted(fit), col='red', type='b') What am I missing here? I want the output to be a smooth curve which connects the points. Answer 1: Try: lines(sort(hp), fitted(fit)[order(hp)], col
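The point of the answer is that lines() connects points in the order they are supplied, and hp is not sorted, so the path doubles back on itself. A self-contained sketch of that fix:

fit <- lm(mpg ~ hp + I(hp^2), data = mtcars)
with(mtcars, {
  plot(hp, mpg)
  ix <- order(hp)                                     # sort by hp so the curve is drawn left to right
  lines(hp[ix], fitted(fit)[ix], col = "red", lwd = 2)
})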

predict.lm() with an unknown factor level in test data

限于喜欢 submitted on 2019-11-26 08:08:14
Question: I am fitting a model to factor data and predicting. If the newdata in predict.lm() contains a single factor level that is unknown to the model, all of predict.lm() fails and returns an error. Is there a good way to have predict.lm() return a prediction for those factor levels the model knows, and NA for unknown factor levels, instead of only an error? Example code: foo <- data.frame(response=rnorm(3), predictor=as.factor(c("A","B","C"))) model <- lm(response~predictor, foo) foo.new <- data
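One possible workaround (an assumption, not necessarily the answer given in the thread): recode levels the model has never seen to NA before predicting, so predict(), which applies na.pass to new data by default, returns NA for those rows instead of stopping with an error. The foo.new below is made up here to complete the truncated example:

foo <- data.frame(response = rnorm(3), predictor = factor(c("A", "B", "C")))
model <- lm(response ~ predictor, foo)
foo.new <- data.frame(predictor = factor(c("A", "B", "C", "D")))  # "D" is unknown to the model (hypothetical)
known <- levels(foo$predictor)
foo.new$predictor <- factor(ifelse(foo.new$predictor %in% known,
                                   as.character(foo.new$predictor), NA),
                            levels = known)
predict(model, foo.new)  # numeric predictions for A, B, C and NA for D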

Why is using update on a lm inside a grouped data.table losing its model data?

。_饼干妹妹 submitted on 2019-11-26 06:46:07
Question: OK, this is a weird one. I suspect this is a bug inside data.table, but it would be useful if anyone can explain why this is happening: what exactly is update doing? I'm using the list(list()) trick inside data.table to store fitted models. When you create a sequence of lm objects, each for a different grouping, and then update those models, the model data for all models becomes that of the last grouping. This seems like a reference is hanging around somewhere where a copy should have been
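For context on what update() does: it takes the call stored in the model, modifies it and re-evaluates it, so the data it refits on is whatever the data symbol in that call resolves to at update time, not the group subset the model was originally built from. A tiny sketch of a safer refit (hypothetical data, not the asker's code):

library(data.table)
dt <- data.table(g = rep(c("a", "b"), each = 10), x = rnorm(20), y = rnorm(20))
models <- dt[, .(fit = list(lm(y ~ x, data = .SD))), by = g]
# Refit from each model's own stored model frame rather than calling update(),
# so the data cannot silently resolve to the last group's .SD
refit <- lapply(models$fit, function(m) lm(formula(m), data = model.frame(m)))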

`lm` summary does not display all factor levels

痞子三分冷 submitted on 2019-11-26 05:34:33
Question: I am running a linear regression on a number of attributes, including two categorical attributes, B and F, and I don't get a coefficient value for every factor level I have. B has 9 levels and F has 6 levels. When I initially ran the model (with intercepts), I got 8 coefficients for B and 5 for F, which I understood as the first level of each being included in the intercept. I want to rank the levels within B and F based on their coefficients, so I added -1 after each factor to lock the
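This behaviour is standard dummy coding: dropping the intercept with -1 only frees up the levels of the first factor in the formula; one level of the second factor must still be absorbed, otherwise the design matrix would not be full rank. A small illustration with made-up data (not the asker's):

set.seed(1)
d <- data.frame(y = rnorm(40),
                B = factor(sample(letters[1:3], 40, replace = TRUE)),
                F = factor(sample(c("u", "v"), 40, replace = TRUE)))
coef(lm(y ~ B + F, data = d))      # intercept, 2 coefficients for B, 1 for F
coef(lm(y ~ B + F - 1, data = d))  # all 3 levels of B, but still only 1 coefficient for F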

Fast pairwise simple linear regression between variables in a data frame

醉酒当歌 submitted on 2019-11-26 04:53:46
Question: I have seen pairwise or general paired simple linear regression many times on Stack Overflow. Here is a toy dataset for this kind of problem. set.seed(0) X <- matrix(runif(100), 100, 5, dimnames = list(1:100, LETTERS[1:5])) b <- c(1, 0.7, 1.3, 2.9, -2) dat <- X * b[col(X)] + matrix(rnorm(100 * 5, 0, 0.1), 100, 5) dat <- as.data.frame(dat) pairs(dat) So basically we want to compute 5 * 4 = 20 regression lines: ----- A ~ B A ~ C A ~ D A ~ E B ~ A ----- B ~ C B ~ D B ~ E C ~ A C ~ B ----- C ~ D
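One vectorised way to attack this (a sketch of the general idea, not necessarily the answer given in the thread) is to note that each simple-regression slope is just a covariance divided by a variance, so all 20 slopes and intercepts can be read off the covariance matrix without any call to lm():

V <- cov(dat)                   # 5 x 5 covariance matrix of the toy data above
m <- colMeans(dat)
slope <- V / diag(V)[col(V)]    # slope[i, j] is the slope from regressing column i on column j
intercept <- m - slope * m[col(slope)]
# sanity check: slope["A", "B"] should match coef(lm(A ~ B, dat))[2]; ignore the diagonal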

Extract regression coefficient values

半世苍凉 submitted on 2019-11-26 04:38:29
Question: I have a regression model for some time series data investigating drug utilisation. The purpose is to fit a spline to a time series and work out 95% CIs etc. The model goes as follows: library(splines) id <- ts(1:length(drug$Date)) a1 <- ts(drug$Rate) a2 <- lag(a1-1) tg <- ts.union(a1,id,a2) mg <- lm(a1~a2+bs(id,df=df1), data=tg) The summary output of mg is: Call: lm(formula = a1 ~ a2 + bs(id, df = df1), data = tg) Residuals: Min 1Q Median 3Q Max -0.31617 -0.11711 -0.02897 0.12330 0.40442 Coefficients: Estimate
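For pulling the numbers out of that summary programmatically, a generic sketch assuming mg as fitted above:

coef(mg)                   # named vector of point estimates
summary(mg)$coefficients   # matrix with estimate, std. error, t value and p value
confint(mg, level = 0.95)  # 95% confidence intervals for each coefficient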