glm | 易学教程

For glm2 package trying to convert factor to numeric and preserving it as a dataframe

Question: Using the glm2 and mlbench packages in R, I am testing with the BreastCancer dataset. I started with 10 columns in the data frame and 479 observations. I want to convert 9 of the 10 columns from factor to numeric and keep the result as a data frame, but my new data frame ended up with 2 columns instead of 10. Here is my code:

    library(mlbench)
    library(glm2)
    data(BreastCancer)
    BC = na.omit(BreastCancer)
    BC = BC[, -1]
    indexes = sample(1:nrow(BC), size=0.3*nrow(BC))
    BCtrain = BC[-indexes,]
    as
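One common way to do the conversion asked about here is to assign the converted columns back into a subset of the data frame, which keeps its shape. This is a minimal sketch only: the toy data frame `df` and its values are hypothetical stand-ins for the BreastCancer columns, and the excerpt does not show what the original code did after the split.

```r
# Toy data frame standing in for the factor columns (hypothetical values).
df <- data.frame(
  a = factor(c("1", "5", "10")),
  b = factor(c("2", "3", "4")),
  Class = factor(c("benign", "malignant", "benign"))
)

# Convert every column except Class. Going through as.character() recovers
# the printed labels rather than the internal integer codes of the factor.
cols <- setdiff(names(df), "Class")
df[cols] <- lapply(df[cols], function(x) as.numeric(as.character(x)))

str(df)
```

Assigning into `df[cols]` rather than overwriting `df` with the result of `lapply()` or `sapply()` is what preserves the remaining columns and the data frame class.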

data error in BMA package's bic.glm but not glm

Question: I am estimating a Poisson model from a set of interaction coefficients, and the BMA package's bic.glm helps navigate the model space. I've been using it for years, but when I updated R from 2.10.x to 2.14.2 last night, it stopped working. Here's the error. First, a call that works:

    > glm(formula(Y~.), data=XY5, family=poisson)
    Call: glm(formula = formula(Y ~ .), family = poisson, data = XY5)
    Coefficients: <results, etc>

Now bic.glm failing:

    > bic.glm(formula(Y~.), data=XY5, glm.family=poisson

Extract Residual Deviance from anova (glm) in R

Question: I fitted a glm model in R and took the anova table. I need to extract the "Residual Deviance" column, but doing so generates an error. Here is the code.

Creating data:

    counts <- c(18,17,15,20,10,20,25,13,12)
    outcome <- gl(3,1,9)
    treatment <- gl(3,3)

Fitting GLM:

    glm.D93 <- glm(counts ~ outcome + treatment, family = quasipoisson(link = "log"))

Anova table:

    av.1=anova(glm.D93)
    av.1
    Analysis of Deviance Table
    Model: quasipoisson, link: log
    Response: counts
    Terms added sequentially (first to last)
    Df
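As a hedged sketch of one way to pull that column out: the table returned by `anova()` is a data frame whose residual deviance column is named `"Resid. Dev"`, and because the name contains a space and a dot it has to be quoted. The excerpt does not show which expression raised the error, so this only illustrates access by the exact column name, using the data from the question.

```r
# Data and model as given in the question.
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
glm.D93 <- glm(counts ~ outcome + treatment,
               family = quasipoisson(link = "log"))

av.1 <- anova(glm.D93)

# Inspect the real column names before indexing.
names(av.1)

# Extract the residual deviance column by its exact (quoted) name.
resid_dev <- av.1[["Resid. Dev"]]
resid_dev
```

The last element of `resid_dev` should agree with `deviance(glm.D93)`, which is a quick consistency check.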

How to retrieve correlation matrix from glm models in R

Question: I am using the gls function from the nlme package. You can copy and paste the following code to reproduce my analysis.

    library(nlme) # Needed for gls function
    # Read in wide format
    tlc = read.table("http://www.hsph.harvard.edu/fitzmaur/ala2e/tlc.dat",header=FALSE)
    names(tlc) = c("id","trt","y0","y1","y4","y6")
    tlc$trt = factor(tlc$trt, levels=c("P","A"), labels=c("Placebo","Succimer"))
    # Convert to long format
    tlc.long = reshape(tlc, idvar="id", varying=c("y0","y1","y4","y6"), v.names="y",

microsimulation GLM including stochastic part

Question: I'm trying to simulate GLM functions in R including stochastic uncertainty. I compared a formula-based approach to R's simulate() function and get different results. I'm not sure what I am doing wrong (it's probably me and not R). I start by creating a simulation cohort:

    set.seed(1)
    library(MASS)
    d <- mvrnorm(n=3000, mu=c(30,12,60), Sigma=matrix(data=c(45, 5, 40, 5, 15, 13, 40, 13, 300), nrow=3))
    d[,1] <- d[,1]^2

Fit the model:

    m <- glm(formula=d[,1]~d[,2] + d[,3], family=gaussian(link=

Calculate cross validation for Generalized Linear Model in Matlab

Question: I am doing a regression using a Generalized Linear Model. I am caught off guard using the crossVal function. My implementation so far:

    x = 'Some dataset, containing the input and the output'
    X = x(:,1:7);
    Y = x(:,8);
    cvpart = cvpartition(Y,'holdout',0.3);
    Xtrain = X(training(cvpart),:);
    Ytrain = Y(training(cvpart),:);
    Xtest = X(test(cvpart),:);
    Ytest = Y(test(cvpart),:);
    mdl = GeneralizedLinearModel.fit(Xtrain,Ytrain,'linear','distr','poisson');
    Ypred = predict(mdl,Xtest);
    res = (Ypred - Ytest);

Fractional Response Regression in R

Question: I am trying to model data in which the response variable lies between 0 and 1, so I have decided to use a fractional response model in R. From my current understanding, the fractional response model is similar to logistic regression, but it uses a quasi-likelihood method to estimate the parameters. I am not sure I understand it correctly. So far I have tried frm from the frm package and glm on the following data, which is the same as this OP:

    library(foreign)
    mydata <- read.dta("k401.dta")
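A fractional logit of the kind described is often fit in base R with `glm()` and the `quasibinomial` family, which accepts a response anywhere in [0, 1]. The sketch below illustrates that idea only: the data are simulated because k401.dta is not available here, and the names `x`, `y` and `fit` are hypothetical. It is not a reproduction of what `frm` does, and in practice robust standard errors are usually paired with this estimator.

```r
# Simulated fractional response in (0, 1); hypothetical stand-in data.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- plogis(0.5 + x + rnorm(n, sd = 0.5))

# Logit link with a quasi-likelihood family: same mean model as logistic
# regression, but the dispersion is estimated instead of fixed at 1.
fit <- glm(y ~ x, family = quasibinomial(link = "logit"))

coef(fit)
range(fitted(fit))
```

Using `family = binomial` on the same data would give the same coefficient estimates but would emit a warning about non-integer successes, which is one reason `quasibinomial` is the usual choice here.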

LC50 / LD50 confidence intervals from multiple regression glm with interaction

Question: I have a quasibinomial glm with two continuous explanatory variables (let's say "LogPesticide" and "LogFood") and an interaction. I would like to calculate the LC50 of the pesticide with confidence intervals at different amounts of food (e.g. the minimum and maximum food value). How can this be achieved? Example: first I generate a data set.

    mydata <- data.frame(
      LogPesticide = rep(log(c(0, 0.1, 0.2, 0.4, 0.8, 1.6) + 0.05), 4),
      LogFood = rep(log(c(1, 2, 4, 8)), each = 6)
    )
    set.seed(seed=16)

How to interpret the probabilities (p0, p1) of the result of h2o.predict()

Question: I would like to understand the meaning of the values returned by the h2o.predict() function from the H2O R package. I noticed that in some cases, when the predict column is 1, the p1 column has a lower value than the p0 column. My interpretation is that the p0 and p1 columns give the probability of each event, so I expected that when predict=1 the probability p1 would be higher than the probability of the opposite event (p0), but this does not always happen, as I can show in the following example: using

Is there any way to fit a `glm()` so that all levels are included (i.e. no reference level)?

Question: Consider the code:

    x <- read.table("http://data.princeton.edu/wws509/datasets/cuse.dat", header=TRUE)[,1:2]
    fit <- glm(education ~ age, family="binomial", data=x)
    summary(fit)

where age has 4 levels: "<25", "25-29", "30-39", "40-49". The results are: So by default, one of the levels is used as a reference level. Is there a way to have glm output coefficients for all 4 levels plus the intercept (i.e. have no reference level)? Software packages like SAS do this by default, so I was wondering if there
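With treatment contrasts, an intercept plus all four level coefficients is not identifiable, so one level has to be absorbed somewhere. A common workaround, sketched below, is to drop the intercept with `0 + age` so that each level gets its own coefficient. This is an illustration under assumptions rather than an answer to the exact request: the data are simulated instead of read from the URL in the question, and the response `y` is a hypothetical 0/1 variable.

```r
# Simulated stand-in data with the four age levels from the question.
set.seed(42)
x <- data.frame(
  age = factor(rep(c("<25", "25-29", "30-39", "40-49"), each = 25)),
  y = rbinom(100, 1, 0.5)
)

# Removing the intercept gives one coefficient per level: each is the
# log-odds of y = 1 within that age group, not a contrast to a reference.
fit <- glm(y ~ 0 + age, family = binomial, data = x)

coef(fit)
```

The fitted probabilities are the same as in the default parameterization; only the meaning of the coefficients changes.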