My problem:
I have a dataset in a large JSON file. I read it and store it in the trainList variable.
Next, I pre-process it.
I have tried some of the answers here and found that they did not work for me.
This works for me:
from sklearn.metrics import classification_report
print(classification_report(y_test, predicted))
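If you also want readable class names in the report, classification_report accepts an optional target_names argument (the names below are hypothetical placeholders for your own labels):
print(classification_report(y_test, predicted, target_names=["negative", "positive"]))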
If you have two lists containing the predicted and actual values, as it appears you do, you can pass them to a function that calculates TP, FP, TN, and FN, with something like this:
def perf_measure(y_actual, y_hat):
    """Count true/false positives and negatives for binary 0/1 labels."""
    TP = 0
    FP = 0
    TN = 0
    FN = 0
    for i in range(len(y_hat)):
        if y_actual[i] == y_hat[i] == 1:
            TP += 1
        if y_hat[i] == 1 and y_actual[i] != y_hat[i]:
            FP += 1
        if y_actual[i] == y_hat[i] == 0:
            TN += 1
        if y_hat[i] == 0 and y_actual[i] != y_hat[i]:
            FN += 1
    return (TP, FP, TN, FN)
From here I think you will be able to calculate the rates of interest to you, and other performance measures such as specificity and sensitivity.
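For example, a minimal sketch (assuming y_actual and y_hat are 0/1 lists as above, and the denominators are non-zero):
TP, FP, TN, FN = perf_measure(y_actual, y_hat)
sensitivity = TP / (TP + FN)  # true positive rate, a.k.a. recall
specificity = TN / (TN + FP)  # true negative rate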
According to the scikit-learn documentation
(http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html#sklearn.metrics.confusion_matrix):
By definition a confusion matrix C is such that C[i, j] is equal to the number of observations known to be in group i but predicted to be in group j. Thus in binary classification, the count of true negatives is C[0, 0], false negatives is C[1, 0], true positives is C[1, 1], and false positives is C[0, 1].
from sklearn.metrics import confusion_matrix

CM = confusion_matrix(y_true, y_pred)

TN = CM[0][0]  # true negatives
FN = CM[1][0]  # false negatives
TP = CM[1][1]  # true positives
FP = CM[0][1]  # false positives
import numpy as np
import pandas as pd

# False positive cases (this example uses a banknote dataset labelled "Forged"/"Genuine")
train = pd.merge(X_train, y_train, left_index=True, right_index=True)

# Attach the predictions as a 'Predicted' column
y_train_pred = pd.DataFrame(y_train_pred)
y_train_pred.rename(columns={0: 'Predicted'}, inplace=True)
train = train.reset_index(drop=True).merge(y_train_pred.reset_index(drop=True),
                                           left_index=True, right_index=True)

# Flag and inspect the false positives
train['FP'] = np.where((train['Banknote'] == "Forged") & (train['Predicted'] == "Genuine"), 1, 0)
train[train.FP != 0]
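Following the same convention, the false negatives can be flagged symmetrically (the column names and labels here are taken from the snippet above):
train['FN'] = np.where((train['Banknote'] == "Genuine") & (train['Predicted'] == "Forged"), 1, 0)
train[train.FN != 0]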
If your classifier has more than two classes, you might want to use pandas-ml for this part; its ConfusionMatrix gives more detailed information. Check it out.
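A minimal sketch, assuming pandas-ml is installed (note that the project is no longer maintained and may not work with recent pandas versions):
from pandas_ml import ConfusionMatrix

cm = ConfusionMatrix(y_true, y_pred)
cm.print_stats()  # prints per-class statistics for the confusion matrix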
The one-liner to get true positives etc. out of the confusion matrix is to ravel it:
from sklearn.metrics import confusion_matrix
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp) # 1 1 1 1
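Note that ravel() only yields four values in the binary case. With more than two classes, per-class counts can instead be derived from the matrix itself; a sketch, where y_true_mc and y_pred_mc are hypothetical multiclass label lists:
import numpy as np
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true_mc, y_pred_mc)
TP = np.diag(cm)                # per-class true positives
FP = cm.sum(axis=0) - TP        # per-class false positives
FN = cm.sum(axis=1) - TP        # per-class false negatives
TN = cm.sum() - (TP + FP + FN)  # per-class true negatives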