confusion-matrix

R Confusion Matrix sensitivity and specificity labeling

Submitted by ⅰ亾dé卋堺 on 2019-12-03 17:12:09
I am using R v3.3.2 and caret 6.0.71 (i.e. the latest versions) to construct a logistic regression classifier, and the confusionMatrix function to create stats for judging its performance.

    logRegConfMat <- confusionMatrix(logRegPrediction, valData[, "Seen"])

              Reference
    Prediction   0   1
             0  30  14
             1  60 164

    Accuracy    : 0.7239
    Sensitivity : 0.3333
    Specificity : 0.9213

The target value in my data (Seen) uses 1 for true and 0 for false. I assume the Reference (ground truth) columns and Prediction (classifier) rows in the…
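For what it's worth, the reported figures are consistent with caret's default of treating the first factor level ("0") as the positive class (overridable via the positive = argument of confusionMatrix). A minimal numeric check of that reading, written in Python for consistency with the other snippets on this page:

    # Check how Sensitivity/Specificity arise when class "0" is the positive class.
    import numpy as np

    # Rows = Prediction, columns = Reference, matching the caret printout above.
    cm = np.array([[30, 14],
                   [60, 164]])

    tp = cm[0, 0]  # predicted 0, reference 0
    fp = cm[0, 1]  # predicted 0, reference 1
    fn = cm[1, 0]  # predicted 1, reference 0
    tn = cm[1, 1]  # predicted 1, reference 1

    print(tp / (tp + fn))  # Sensitivity for class 0: 30 / 90  = 0.3333
    print(tn / (tn + fp))  # Specificity for class 0: 164 / 178 = 0.9213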

True Positive Rate and False Positive Rate (TPR, FPR) for Multi-Class Data in Python

Submitted by 回眸只為那壹抹淺笑 on 2019-12-03 09:05:40
How do you compute the true and false positive rates of a multi-class classification problem? Say,

    y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
    y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]

The confusion matrix is computed by metrics.confusion_matrix(y_true, y_prediction), but that just shifts the problem. EDIT after @seralouk's answer: here, class -1 is to be considered the negatives, while 0 and 1 are variations of positives.

Answer: Using your data, you can get all the metrics for all the classes at once:

    import numpy as np
    from sklearn…
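The answer is cut off mid-import; the standard way to finish the computation is one-vs-rest over the confusion matrix. A sketch of that approach (my reconstruction, not the original answer text):

    # Per-class TPR/FPR by treating each class one-vs-rest.
    import numpy as np
    from sklearn.metrics import confusion_matrix

    y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
    y_pred = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]

    labels = [-1, 0, 1]
    cm = confusion_matrix(y_true, y_pred, labels=labels)

    for i, label in enumerate(labels):
        tp = cm[i, i]
        fn = cm[i, :].sum() - tp  # true class i, predicted as something else
        fp = cm[:, i].sum() - tp  # predicted as class i, actually something else
        tn = cm.sum() - tp - fn - fp
        print(f"class {label}: TPR = {tp / (tp + fn):.3f}, FPR = {fp / (fp + tn):.3f}")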

caret train() predicts very differently than predict.glm()

Submitted by 丶灬走出姿态 on 2019-12-02 13:55:19
I am trying to estimate a logistic regression using 10-fold cross-validation.

    # import libraries
    library(car); library(caret); library(e1071); library(verification)

    # data import and preparation
    data(Chile)
    chile <- na.omit(Chile)                                  # remove NAs
    chile <- chile[chile$vote == "Y" | chile$vote == "N", ]  # only "Y" and "N" required
    chile$vote <- factor(chile$vote)                         # required to remove unwanted levels
    chile$income <- factor(chile$income)                     # treat income as a factor

The goal is to estimate a glm model that predicts the outcome of vote ("Y" or "N") depending on the relevant explanatory variables and, based on…
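The gist of the eventual mismatch can be shown compactly in Python (an illustrative stand-in using scikit-learn, not the caret/glm code from the question): predictions gathered during cross-validation come from models fit on nine of the ten folds, so they generally differ from those of a single model refit on all of the data.

    # Sketch: cross-validated predictions vs. predictions from a refit model.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict

    X, y = load_breast_cancer(return_X_y=True)
    clf = LogisticRegression(max_iter=5000)

    # Each sample is predicted by a model that never saw it during fitting.
    cv_pred = cross_val_predict(clf, X, y, cv=10)

    # A single model refit on all of the data can disagree on some samples.
    refit_pred = clf.fit(X, y).predict(X)
    print((cv_pred != refit_pred).sum(), "predictions differ")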

How to display the confusion matrix and report (recall, precision, F-measure) for each cross-validation fold

Submitted by 泪湿孤枕 on 2019-12-02 12:13:17
Question: I am trying to perform 10-fold cross-validation in Python. I know how to calculate the confusion matrix and the report for a single train/test split (for example, an 80% training and 20% testing split). The problem is that I don't know how to calculate the confusion matrix and report for each of the folds, e.g. for fold 10; I only know the code for the average accuracy.

Answer 1: Here is a reproducible example with the breast cancer data and 3-fold CV for simplicity:

    from sklearn.datasets import load_breast_cancer
    from sklearn…
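A completed version along the same lines (a reconstruction under those assumptions, not the original answer) could look like this:

    # Confusion matrix and classification report printed per fold.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report, confusion_matrix
    from sklearn.model_selection import StratifiedKFold

    X, y = load_breast_cancer(return_X_y=True)
    skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

    for fold, (train_idx, test_idx) in enumerate(skf.split(X, y), start=1):
        clf = LogisticRegression(max_iter=5000).fit(X[train_idx], y[train_idx])
        y_pred = clf.predict(X[test_idx])
        print(f"--- fold {fold} ---")
        print(confusion_matrix(y[test_idx], y_pred))
        print(classification_report(y[test_idx], y_pred))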

Creating confusion matrix from multiple .csv files

Submitted by 吃可爱长大的小学妹 on 2019-12-02 10:41:37
I have a lot of .csv files with the following format:

    338,800
    338,550
    339,670
    340,600
    327,500
    301,430
    299,350
    284,339
    284,338
    283,335
    283,330
    283,310
    282,310
    282,300
    282,300
    283,290

From column 1, I want to read the current row and compare it with the value of the previous row. If it is greater or equal, I continue comparing; if the current value is smaller than the previous row's, I divide the current value by the previous value and proceed. For example, in the table given above, the smaller value we will get, per my requirement, from column 1 is 327 (because 327 is…
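One hedged Python reading of that scan (pandas; the file name data.csv and collecting the ratios into a list are my assumptions about the intent):

    # Walk column 1; whenever a value drops below its predecessor,
    # record current / previous and keep going.
    import pandas as pd

    df = pd.read_csv("data.csv", header=None, names=["col1", "col2"])

    ratios = []
    prev = df["col1"].iloc[0]
    for cur in df["col1"].iloc[1:]:
        if cur < prev:                 # current value is smaller than the previous row
            ratios.append(cur / prev)  # e.g. 327 / 340 at the first drop above
        prev = cur

    print(ratios)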

Multi-class multi-label confusion matrix with Sklearn

Submitted by 无人久伴 on 2019-12-02 08:56:48
Question: I am working with multi-class, multi-label output from my classifier. The total number of classes is 14, and instances can have multiple classes associated with them. For example:

    y_true = np.array([[0, 0, 1], [1, 1, 0], [0, 1, 0]])
    y_pred = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 0]])

The way I am making my confusion matrix right now:

    matrix = confusion_matrix(y_true.argmax(axis=1), y_pred.argmax(axis=1))
    print(matrix)

which gives an output like:

    [[ 79   0   0   0  66   0   0 151   1   8   0   0   0   0]
     [  4   0   0   0  11   0   0  27   0   0   0   0   0   0]
     […
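Note that the argmax calls collapse the multi-label structure down to one class per instance. On scikit-learn >= 0.21, multilabel_confusion_matrix preserves it by producing one 2x2 matrix per label; a short sketch:

    # One 2x2 confusion matrix per label column, shape (n_labels, 2, 2),
    # with each matrix laid out as [[tn, fp], [fn, tp]].
    import numpy as np
    from sklearn.metrics import multilabel_confusion_matrix

    y_true = np.array([[0, 0, 1], [1, 1, 0], [0, 1, 0]])
    y_pred = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 0]])

    print(multilabel_confusion_matrix(y_true, y_pred))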

How can I analyze a confusion matrix?

Submitted by 不问归期 on 2019-12-01 19:46:53
When I print out scikit-learn's confusion matrix, I receive a very large matrix. I want to analyze what the true positives, true negatives, etc. are. How can I do so? This is how my confusion matrix looks; I wish to understand it better:

    [[4015  336    0 ...,    0    0    2]
     [ 228 2704    0 ...,    0    0    0]
     [   4    7   19 ...,    0    0    0]
     ...,
     [   3    2    0 ...,    5    0    0]
     [   1    1    0 ...,    0    0    0]
     [  13    1    0 ...,    0    0   11]]

Answer 1: IIUC, your question is undefined. "False positives" and "true negatives" are terms that are defined only for binary classification. Read more about the definition of a confusion matrix. In this case, the confusion…
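That said, binary-style counts can still be read off per class by treating each class one-vs-rest. A vectorized sketch (mine, not the answer's continuation; cm stands in for the full matrix):

    # One-vs-rest TP/FP/FN/TN for every class of a multi-class confusion matrix.
    import numpy as np

    cm = np.array([[4015, 336, 2],
                   [228, 2704, 0],
                   [13, 1, 11]])  # small stand-in; use the full matrix here

    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp        # predicted as class c, actually another class
    fn = cm.sum(axis=1) - tp        # actually class c, predicted as another class
    tn = cm.sum() - (tp + fp + fn)  # everything else
    print(tp, fp, fn, tn, sep="\n")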

TensorFlow confusion matrix using one-hot encoding

Submitted by 廉价感情. on 2019-12-01 10:41:38
I have a multi-class classification model using an RNN, and here is my main RNN code:

    def RNN(x, weights, biases):
        x = tf.unstack(x, input_size, 1)
        lstm_cell = rnn.BasicLSTMCell(num_unit, forget_bias=1.0, state_is_tuple=True)
        stacked_lstm = rnn.MultiRNNCell([lstm_cell] * lstm_size, state_is_tuple=True)
        outputs, states = tf.nn.static_rnn(stacked_lstm, x, dtype=tf.float32)
        return tf.matmul(outputs[-1], weights) + biases

    logits = RNN(X, weights, biases)
    prediction = tf.nn.softmax(logits)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
    optimizer = tf.train…
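With one-hot labels, the usual way to get a confusion matrix out of a graph like this is to argmax both tensors back to class indices. A sketch that assumes the tensors defined above (tf.confusion_matrix in TF 1.x; the same op lives at tf.math.confusion_matrix in TF 2.x):

    # Collapse one-hot rows to class indices, then build the confusion matrix op.
    pred_classes = tf.argmax(prediction, axis=1)
    true_classes = tf.argmax(Y, axis=1)
    conf_mat = tf.confusion_matrix(labels=true_classes, predictions=pred_classes)

    # Evaluate inside the training session (x_test / y_test are hypothetical):
    # cm = sess.run(conf_mat, feed_dict={X: x_test, Y: y_test})
    # print(cm)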