logistic-regression

Stratified splitting of the data

一个人想着一个人 submitted on 2020-02-26 08:24:33
Question: I have a large data set and would like to fit a different logistic regression for each City, one of the columns in my data. The following 70/30 split works without taking the City group into account:

    indexes <- sample(1:nrow(data), size = 0.7*nrow(data))
    train <- data[indexes,]
    test <- data[-indexes,]

But this does not guarantee a 70/30 split for each city. Let's say I have City A and City B, where City A has 100 rows and City B has 900 rows, totaling 1000 rows. Splitting the data with the above code will give …
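
A minimal sketch of a per-city 70/30 split, assuming a data frame named data with a City column as in the question: sampling is done within each city, so the ratio holds for every group rather than only overall.

    set.seed(123)
    # row indices for each city, sampled separately so every city keeps ~70% in train
    idx_by_city <- lapply(split(seq_len(nrow(data)), data$City), function(rows) {
      sample(rows, size = floor(0.7 * length(rows)))
    })
    train_idx <- unlist(idx_by_city)

    train <- data[train_idx, ]
    test  <- data[-train_idx, ]

Per-city models can then be fitted with, for example, lapply(split(train, train$City), function(d) glm(..., data = d, family = binomial)), with the formula filled in for the actual outcome and predictors.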

logistic regression predicts “NA” probability in R - why?

前提是你 submitted on 2020-02-05 05:10:27
Question: I have run a logistic regression in R using the following code:

    logistic.train.model3 <- glm(josh.model2, family = binomial(link = logit),
                                 data = auth, na.action = na.exclude)
    print(summary(logistic.train.model3))

My response variable is binary, taking on values of 1 or 0. When I look at the summary, everything looks fine: every variable has a coefficient. However, when I try to output the predicted probabilities using the following code:

    auth$predict.train.logistic <- predict(logistic.train.model3 …
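
With na.action = na.exclude, rows that have an NA in any model variable are dropped from the fit and are padded back as NA by predict(), which is the usual reason NA probabilities appear. A quick diagnostic sketch, assuming josh.model2 is a formula whose variables are all columns of auth:

    # count missing values per model variable
    colSums(is.na(auth[, all.vars(josh.model2)]))

    # how many fitted probabilities came back as NA
    pred <- predict(logistic.train.model3, type = "response")
    sum(is.na(pred))

If NAs in the predictors are the cause, either impute them or fit and predict on complete cases only.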

Cluster standard errors for ordered logit R polr - values deleted in estimation

此生再无相见时 submitted on 2020-02-05 05:01:05
Question: I am quite new to R and used to pretty basic applications. Now I have encountered a problem I need help with: I am looking for a way to cluster standard errors for an ordered logistic regression (my estimation is similar to this example). I have already tried robcov and vcovCL, and they give me similar error messages:

    Error in meatCL(x, cluster = cluster, type = type, ...) :
      number of observations in 'cluster' and 'estfun()' do not match
    Error in u[, ii] <- ui : number of items to replace is not a …
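
That mismatch typically arises because polr() silently drops rows with missing values, so the cluster vector is longer than the estimation sample. A sketch with hypothetical column names that keeps the two aligned by restricting to complete cases before fitting:

    library(MASS)      # polr()
    library(sandwich)  # vcovCL()

    # drop incomplete rows first so the cluster id lines up row-for-row with the fit;
    # y is assumed to be an ordered factor
    dat <- na.omit(mydata[, c("y", "x1", "x2", "cluster_id")])

    fit <- polr(y ~ x1 + x2, data = dat, Hess = TRUE)

    vc <- vcovCL(fit, cluster = dat$cluster_id)  # cluster-robust covariance matrix
    sqrt(diag(vc))                               # clustered standard errors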

Is it reasonable for l1/l2 regularization to cause all feature weights to be zero in vowpal wabbit?

|▌冷眼眸甩不掉的悲伤 submitted on 2020-02-01 08:28:37
Question: I got a weird result from vw, which uses an online learning scheme for logistic regression. When I add --l1 or --l2 regularization, I get all predictions at 0.5 (which means all feature weights are 0). Here's my command:

    vw -d training_data.txt --loss_function logistic -f model_l1 --invert_hash model_readable_l1 --l1 0.05 --link logistic

...and here's the learning process info:

    using l1 regularization = 0.05
    final_regressor = model_l1
    Num weight bits = 18
    learning rate = 0.5
    initial_t = 0
    power_t = …
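
An --l1 value of 0.05 is a very aggressive penalty for vw's per-example updates, so all-zero weights and constant 0.5 predictions are a plausible outcome rather than a bug. As an analogy in R rather than vw itself, this glmnet sketch on toy, hypothetical data shows the same effect: a penalty above the largest useful lambda zeroes every slope and the predicted probabilities collapse to the base rate.

    library(glmnet)

    set.seed(1)
    x <- matrix(rnorm(200 * 10), ncol = 10)   # toy features
    y <- rbinom(200, 1, plogis(x[, 1]))       # outcome driven by the first feature only

    # a deliberately oversized L1 penalty: every slope is shrunk to exactly zero
    fit <- glmnet(x, y, family = "binomial", alpha = 1, lambda = 1)
    coef(fit)                                        # intercept only, all slopes 0
    head(predict(fit, newx = x, type = "response"))  # all predictions at the base rate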

Stepwise regression error in R

安稳与你 submitted on 2020-01-30 12:10:50
Question: I want to run a stepwise regression in R to choose the best-fitting model; my code is attached here:

    full.modelfixed <- glm(died_ed ~ age_1 + gender + race + insurance + injury +
                             ais + blunt_pen + comorbid + iss + min_dist +
                             pop_dens_new + age_mdn + male_pct + pop_wht_pct +
                             pop_blk_pct + unemp_pct + pov_100x_npct + urban_pct,
                           data = trauma, family = binomial(link = 'logit'),
                           na.action = na.exclude)
    reduced.modelfixed <- stepAIC(full.modelfixed, direction = "backward")

There is an error message …
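
A common cause of stepAIC() failing on a glm like this is missing data: as terms are dropped, the set of complete cases changes, so the candidate models are no longer fitted to the same rows and the AIC comparison aborts. A sketch of the usual workaround, restricting to complete cases on the model variables before fitting (object names mirror the question):

    library(MASS)   # stepAIC()

    vars <- all.vars(formula(full.modelfixed))   # outcome plus all predictors
    trauma_cc <- na.omit(trauma[, vars])         # identical rows for every candidate model

    full_cc <- glm(formula(full.modelfixed), data = trauma_cc,
                   family = binomial(link = "logit"))
    reduced_cc <- stepAIC(full_cc, direction = "backward")
    summary(reduced_cc)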

Goodness-of-fit for fixed effect logit model using 'bife' package

99封情书 submitted on 2020-01-24 04:19:04
Question: I am using the 'bife' package to run a fixed-effects logit model in R. However, I cannot compute any goodness-of-fit measure for the model's overall fit from the output I have below. I would appreciate knowing how to measure goodness-of-fit given this limited information. I would prefer a chi-square test, but I cannot find a way to implement that either.

    ---------------------------------------------------------------
    Fixed effects logit model with analytical bias-correction

    Estimated …
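
One workaround, sketched here without the bife API itself: if the panel is small enough to refit the same specification as a plain glm() with dummy fixed effects, standard likelihood-based measures become available, including a likelihood-ratio chi-square against the null model and McFadden's pseudo-R^2. All object and column names below are hypothetical.

    # refit with explicit fixed-effect dummies (feasible only for moderate panel sizes)
    fit_fe   <- glm(y ~ x1 + x2 + factor(id), data = panel_data,
                    family = binomial(link = "logit"))
    fit_null <- glm(y ~ 1, data = panel_data, family = binomial(link = "logit"))

    # likelihood-ratio chi-square test of the full model against the null
    lr_stat <- as.numeric(2 * (logLik(fit_fe) - logLik(fit_null)))
    df      <- attr(logLik(fit_fe), "df") - attr(logLik(fit_null), "df")
    pchisq(lr_stat, df = df, lower.tail = FALSE)

    # McFadden's pseudo-R^2
    1 - as.numeric(logLik(fit_fe) / logLik(fit_null))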

How to do logistic regression on summary data in R?

时光毁灭记忆、已成空白 submitted on 2020-01-23 17:31:25
Question: So I have some data that is structured similarly to the following:

              | Works | DoesNotWork |
    ---------------------------------
    Unmarried |  130  |     235     |
    Married   |   10  |      95     |

I'm trying to use logistic regression to predict WorkStatus from MarriageStatus, but I don't think I understand how to do this in R. For example, if my data looks like the following:

    MarriageStatus | WorkStatus |
    -----------------------------
    Married        | No         |
    Married        | No         |
    Married        | Yes        |
    Unmarried      | No         |
    Unmarried      | Yes        |
    Unmarried      | …
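
For the aggregated 2x2 table, glm() can take a two-column matrix of (successes, failures) as the binomial response, so no expansion to one row per person is needed. A sketch using the counts from the question's table:

    # summary counts from the question
    tab <- data.frame(
      MarriageStatus = c("Unmarried", "Married"),
      Works          = c(130, 10),
      DoesNotWork    = c(235, 95)
    )

    # cbind(successes, failures) as the response models P(Works)
    fit <- glm(cbind(Works, DoesNotWork) ~ MarriageStatus,
               data = tab, family = binomial)
    summary(fit)
    predict(fit, type = "response")   # fitted P(Works) for each marital status

With the one-row-per-person layout, the equivalent call is glm(WorkStatus == "Yes" ~ MarriageStatus, family = binomial) on the long data frame; the two forms give the same coefficients.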

Confusing results with logistic regression in Python

此生再无相见时 submitted on 2020-01-23 17:27:29
Question: I'm doing logistic regression in Python with this example from Wikipedia (link to example). Here's the code I have:

    from sklearn.linear_model import LogisticRegression
    lr = LogisticRegression()
    Z = [[0.5], [0.75], [1.0], [1.25], [1.5], [1.75], [1.75], [2.0], [2.25], [2.5],
         [2.75], [3.0], [3.25], [3.5], [4.0], [4.25], [4.5], [4.75], [5.0], [5.5]]  # number of hours spent studying
    y = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1]  # 0=failed, 1=pass
    lr.fit(Z, y)

The results for this are …
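
The usual source of confusion here is that scikit-learn's LogisticRegression applies L2 regularization by default (C=1.0), so its coefficients will not match the unpenalized fit reported in the Wikipedia article. A cross-check sketch in R on the same data, where glm() fits the unpenalized model:

    # same hours-studied data as the Python snippet above
    hours <- c(0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 1.75, 2.0, 2.25, 2.5,
               2.75, 3.0, 3.25, 3.5, 4.0, 4.25, 4.5, 4.75, 5.0, 5.5)
    pass  <- c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1)

    fit <- glm(pass ~ hours, family = binomial(link = "logit"))
    coef(fit)   # roughly: (Intercept) -4.08, hours 1.50

In the Python snippet, passing a large C (e.g. C=1e9, effectively no penalty) to LogisticRegression should bring its coefficients close to these values.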