coefficients

Calculating Standard Error of Coefficients for Logistic Regression in Spark

Submitted by 亡梦爱人 on 2019-12-22 18:19:01
Question: I know this question has been asked previously here, but I couldn't find the correct answer. The answer provided in the previous post suggests using Statistics.chiSqTest(data), which provides a goodness-of-fit test (Pearson's chi-square test), not the Wald chi-square tests for the significance of coefficients. I was trying to build the parameter estimate table for logistic regression in Spark. I was able to get the coefficients and intercepts, but I couldn't find the Spark API to get the standard errors of the coefficients.
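For what it's worth, Spark's GeneralizedLinearRegression with a binomial family does expose coefficientStandardErrors, tValues and pValues in its training summary, unlike ml's LogisticRegression. As a reference point, here is a minimal R sketch (my own illustration, not Spark code) of the Wald calculation those standard errors come from, using the variance-covariance matrix (inverse Fisher information) of a fitted glm; mtcars is stand-in data:

# Wald standard errors, z statistics and p-values for a logistic model
data(mtcars)
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)
se <- sqrt(diag(vcov(fit)))   # Wald standard errors from the inverse Fisher information
z  <- coef(fit) / se          # Wald z statistics
p  <- 2 * pnorm(-abs(z))      # two-sided p-values
cbind(estimate = coef(fit), std.error = se, z.value = z, p.value = p)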

How can I get the relative importance of features of a logistic regression for a particular prediction?

Submitted by 放肆的年华 on 2019-12-21 05:11:27
Question: I am using a Logistic Regression (in scikit) for a binary classification problem, and am interested in being able to explain each individual prediction. To be more precise, I'm interested in predicting the probability of the positive class and having a measure of the importance of each feature for that prediction. Using the coefficients (betas) as a measure of importance is generally a bad idea, as answered here, but I'm yet to find a good alternative. So far the best I have found are the following three options: Monte Carlo option: fixing all other features, re-run the prediction replacing the …
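The question is scikit-specific, but the usual heuristics are library-agnostic. One simple option (my own illustration, not the asker's chosen method) is to decompose the log-odds of a single prediction into per-feature contributions beta_j * (x_j - mean(x_j)); a minimal R sketch with stand-in data:

# Per-prediction feature contributions to the log-odds of one observation
data(mtcars)
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)
x_new <- mtcars[1, c("hp", "wt")]                 # the single case to explain
centers <- colMeans(mtcars[, c("hp", "wt")])
contrib <- coef(fit)[-1] * (as.numeric(x_new) - centers)   # log-odds contributions
sort(abs(contrib), decreasing = TRUE)             # rank features for this prediction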

How to add linear model results (adjusted R-squared, slope and p-value) onto a regression plot in R

Submitted by 感情迁移 on 2019-12-13 10:00:28
Question: Hi, I have created a linear model and a regression plot. However, I would like to have the model results (adjusted R-squared, slope and p-value) on the plot itself, something like the image below. How do I show the key results on the plot? Below is my code for the plot:

library(ggplot2)
ggplot(HP_crime15, aes(x = as.numeric(Theft15), y = as.numeric(X2015))) +
  geom_point(shape = 1) +
  geom_smooth(method = lm) +
  xlab("Recorded number of Thefts") +
  ylab("House prices (£)") +
  ggtitle("Title")

Answer 1: Ideally good …
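A sketch of one way to do it (my own, not the truncated accepted answer above): build a label string from the lm summary and place it with annotate(); HP_crime15 and its column names are taken from the question:

library(ggplot2)
fit <- lm(X2015 ~ Theft15, data = HP_crime15)
s <- summary(fit)
lab <- sprintf("adj. R^2 = %.3f\nslope = %.2f\np = %.3g",
               s$adj.r.squared,    # adjusted R-squared
               coef(fit)[2],       # slope estimate
               coef(s)[2, 4])      # p-value of the slope
ggplot(HP_crime15, aes(x = Theft15, y = X2015)) +
  geom_point(shape = 1) +
  geom_smooth(method = lm) +
  annotate("text", x = -Inf, y = Inf, hjust = -0.1, vjust = 1.2, label = lab) +
  xlab("Recorded number of Thefts") +
  ylab("House prices (£)")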

Obtaining regression coefficients from reduced major axis regression models using lmodel2 package

Submitted by ぐ巨炮叔叔 on 2019-12-12 04:59:41
Question: I have a large data set with which I'm undertaking many regression analyses. I'm using reduced major axis regression via R's lmodel2 package. What I need to do is extract the regression coefficients (R-squared, p-values, slope and intercept) from the RMA models. I can do this easily enough for the OLS regressions using:

RSQ <- summary(model)$r.squared
PVAL <- summary(model)$coefficients[2, 4]
INT <- summary(model)$coefficients[1, 1]
SLOPE <- summary(model)$coefficients[2, 1]

And then export them in …
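A minimal sketch of the lmodel2 side, assuming a fitted lmodel2 object (mydata, x and y are stand-in names). Note that lmodel2 reports reduced major axis under the method name "SMA" (standard major axis); its "RMA" is ranged major axis and is only computed when ranges are supplied:

library(lmodel2)
model2 <- lmodel2(y ~ x, data = mydata, nperm = 99)
res <- model2$regression.results          # one row per method (OLS, MA, SMA, ...)
sma <- res[res$Method == "SMA", ]         # reduced/standard major axis row
RSQ   <- model2$rsquare                   # r-squared (shared across methods)
PVAL  <- model2$P.param                   # parametric two-tailed p-value
INT   <- sma$Intercept
SLOPE <- sma$Slope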

How to calculate the energy spectrum of a signal?

Submitted by 时光毁灭记忆、已成空白 on 2019-12-12 02:02:55
Question: I know from theory that the energy spectrum of a given signal is the sum of the squared Fourier coefficients. If I have the real and imaginary parts of the corresponding Fourier coefficients, can I say that the energy spectrum of the signal is equal to the sum of (real part + imaginary part)^2? I hope it is clear what I'm trying to say. Best regards, ben

Answer 1: Not quite. You want:

sum of fft_result_magnitudes^2

which is:

sum of (sqrt(real_part^2 + imaginary_part^2))^2

which is:

sum of (real_part^2 + imaginary_part^2)
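A quick numerical check of that identity in R (my sketch, with a stand-in sine wave); Mod() computes the complex magnitude sqrt(Re^2 + Im^2), and Parseval's theorem ties the total back to the time-domain energy:

x <- sin(2 * pi * 5 * seq(0, 1, length.out = 256))  # stand-in signal
X <- fft(x)
energy  <- sum(Mod(X)^2)             # sum of |X_k|^2
energy2 <- sum(Re(X)^2 + Im(X)^2)    # identical, in expanded form
# Parseval (R's fft is unnormalized): sum(x^2) == sum(|X_k|^2) / N
all.equal(sum(x^2), energy / length(x))   # TRUE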

Calculate and compare coefficient estimates from a regression interaction for each group

Submitted by 为君一笑 on 2019-12-11 06:37:19
Question: A) I am interested in the effects of a continuous variable (Var1) on a continuous dependent variable (DV), conditional on four different groups defined by two binary variables (Dummy1 and Dummy2). I thus run a three-way interaction:

Var1 <- sample(0:10, 100, replace = T)
Dummy1 <- sample(c(0, 1), 100, replace = T)
Dummy2 <- sample(c(0, 1), 100, replace = T)
DV <- 2*Var1 + Var1*Dummy1 + 2*Var1*Dummy2 + 10*Var1*Dummy1*Dummy2 + rnorm(100)
fit <- lm(DV ~ Var1*Dummy1*Dummy2)

I …
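A sketch of one way to get the group-specific slopes of Var1 from that fit (my own illustration, continuing from the simulated data and model in the question): combine the relevant coefficients for each Dummy1/Dummy2 cell:

b <- coef(fit)
# slope of Var1 in the group where Dummy1 = d1 and Dummy2 = d2
slope <- function(d1, d2) {
  unname(b["Var1"] + d1 * b["Var1:Dummy1"] + d2 * b["Var1:Dummy2"] +
         d1 * d2 * b["Var1:Dummy1:Dummy2"])
}
c(g00 = slope(0, 0),   # true value 2
  g10 = slope(1, 0),   # true value 3
  g01 = slope(0, 1),   # true value 4
  g11 = slope(1, 1))   # true value 15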

scikit-learn LogisticRegressionCV: best coefficients

Submitted by 情到浓时终转凉″ on 2019-12-11 05:09:40
Question: I am trying to understand how the best coefficients are calculated in a logistic regression cross-validation, where the "refit" parameter is True. If I understand the docs correctly, the best coefficients are the result of first determining the best regularization parameter "C", i.e., the value of C that has the highest average score over all folds. Then, the best coefficients are simply the coefficients that were calculated on the fold that has the highest score for the best C. I assume that …
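For intuition (an analogy of mine, not scikit-learn itself): R's cv.glmnet follows the same select-then-refit pattern, where cross-validation only chooses the regularization strength and the reported coefficients come from a fit on the full data at that value. A minimal sketch with stand-in data:

library(glmnet)
x <- as.matrix(mtcars[, c("hp", "wt", "disp")])
y <- mtcars$am
cvfit <- cv.glmnet(x, y, family = "binomial", nfolds = 5)  # CV picks lambda only
coef(cvfit, s = "lambda.min")   # coefficients from the full-data fit at the best lambda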

Fit glm with known coefficients and unknown intercept

Submitted by 安稳与你 on 2019-12-11 02:59:06
Question: I am trying to fit a logistic regression model using glm, where I am only interested in the intercept, but I still want the model to be fitted with known coefficients. Example:

beta <- c(24.5, 3.6, 2.87, 7.32)

So I want to use

model <- glm(y ~ x_1 + x_2 + x_3 + x_4, family = binomial(link = "logit"), data = dt)

and in some way incorporate the known betas, so the glm function only fits the alpha (intercept). How can I do that?

Answer 1: With offsets, which add a known term to the linear predictor (the RHS of the formula, on the link scale).
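A minimal sketch of the offset approach (dt, y and x_1 through x_4 are the question's hypothetical names): fold the known slopes into a single offset column so glm estimates only the intercept:

beta <- c(24.5, 3.6, 2.87, 7.32)
# known part of the linear predictor: X %*% beta
dt$known <- drop(as.matrix(dt[, c("x_1", "x_2", "x_3", "x_4")]) %*% beta)
model <- glm(y ~ 1 + offset(known), family = binomial(link = "logit"), data = dt)
coef(model)   # only the intercept (alpha) is estimated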

Does R always return NA as a coefficient as a result of linear regression with unnecessary variables?

Submitted by 微笑、不失礼 on 2019-12-10 13:49:39
Question: My question is about unnecessary predictors, namely variables that do not provide any new linear information, i.e. variables that are linear combinations of the other predictors. As you can see, the swiss dataset has six variables:

data(swiss)
names(swiss)
# "Fertility" "Agriculture" "Examination" "Education"
# "Catholic" "Infant.Mortality"

Now I introduce a new variable ec. It is the linear combination of Examination and Catholic:

ec <- swiss$Examination + swiss$Catholic

When …
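A sketch completing the experiment (my guess at the asker's next step, not their original code): add ec to a full regression and note that lm gives it an NA coefficient, because its pivoting QR decomposition drops exactly collinear columns rather than failing:

data(swiss)
ec <- swiss$Examination + swiss$Catholic   # exact linear combination
fit <- lm(Fertility ~ . + ec, data = swiss)
coef(fit)    # ec comes back NA: it adds no new linear information
alias(fit)   # reports the linear dependency involving ec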
