Multilabel, Multiclass accuracy : how to calculate accuracy for Multiclass, Multilabel classification?

◇◆丶佛笑我妖孽 submitted on 2019-12-22 00:28:50

Question


I am working on a multilabel, multiclass classification framework, and I want to add metrics for computing multilabel and multiclass accuracy.

Here is some demo data:

predicted_labels = [[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,1,0,1]]
true_labels      = [[1,1,0,0,1],[1,0,0,1,1],[1,0,0,0,1],[1,1,1,0,1],[1,0,0,0,1],[1,0,0,0,1]]

The most popular accuracy metrics for multi-label, multi-class classification are:

  1. Hamming score
  2. Hamming loss
  3. Subset accuracy

The code for these three is:

import numpy as np
import sklearn.metrics

def hamming_score(y_true, y_pred, normalize=True, sample_weight=None):
    '''
    Compute the Hamming score (a.k.a. label-based accuracy) for the multi-label case,
    together with subset accuracy and Hamming loss.

    normalize and sample_weight are accepted but not used.
    '''
    # Accept lists of lists as well as arrays; .shape below needs an ndarray.
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    acc_list = []
    for i in range(y_true.shape[0]):
        # Indices of the labels that are switched on in this row.
        set_true = set(np.where(y_true[i])[0])
        set_pred = set(np.where(y_pred[i])[0])
        if len(set_true) == 0 and len(set_pred) == 0:
            # No true and no predicted labels: count as a perfect match.
            tmp_a = 1
        else:
            # Size of the intersection over size of the union of the two label sets.
            tmp_a = len(set_true.intersection(set_pred)) / float(len(set_true.union(set_pred)))
        acc_list.append(tmp_a)

    return {'hamming_score': np.mean(acc_list),
            'subset_accuracy': sklearn.metrics.accuracy_score(y_true, y_pred, normalize=True, sample_weight=None),
            'hamming_loss': sklearn.metrics.hamming_loss(y_true, y_pred)}
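
As a cross-check on the function above, the same three numbers can probably be obtained from scikit-learn alone. This is a minimal sketch, assuming a scikit-learn version that provides `jaccard_score` (0.21 or later) and that no row has both an empty true set and an empty predicted set, which holds for the demo data. The names `y_true`, `y_pred`, `subset_acc`, `h_loss` and `h_score` are chosen here for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss, jaccard_score

# Demo data from the question, as 2-D indicator arrays.
y_pred = np.array([[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,1,0,1]])
y_true = np.array([[1,1,0,0,1],[1,0,0,1,1],[1,0,0,0,1],[1,1,1,0,1],[1,0,0,0,1],[1,0,0,0,1]])

# Subset accuracy: fraction of rows where every label matches.
subset_acc = accuracy_score(y_true, y_pred)

# Hamming loss: fraction of individual label slots that are wrong.
h_loss = hamming_loss(y_true, y_pred)

# Per-row intersection over union, averaged over rows (the "Hamming score").
h_score = jaccard_score(y_true, y_pred, average='samples')

print(subset_acc)  # 2 of 6 rows match exactly
print(h_loss)      # 5 of 30 label slots differ
print(h_score)     # 0.75
```

On this data only rows 3 and 5 match exactly, so subset accuracy is much lower than the two label-level scores.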

But I was looking for an F1 score for multilabel classification, so I tried to use sklearn's f1_score:

print(f1_score(demo, true, average='micro'))

But it gave me this error:

> ValueError: multiclass-multioutput is not supported
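
The message names a target type that scikit-learn inferred from the input. One way to inspect that inference is `sklearn.utils.multiclass.type_of_target`. The sketch below is only a diagnostic: the contents of `demo` and `true` are not shown in the question, so `three_valued` is a made-up array used purely to illustrate an input that is classified as multiclass-multioutput.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Demo true labels from the question: a 2-D array containing only 0 and 1.
y_true = np.array([[1,1,0,0,1],[1,0,0,1,1],[1,0,0,0,1],[1,1,1,0,1],[1,0,0,0,1],[1,0,0,0,1]])

# Hypothetical 2-D array with three distinct values (not from the question).
three_valued = np.array([[1, 0, 2], [0, 1, 1]])

kind_indicator = type_of_target(y_true)
kind_three = type_of_target(three_valued)

print(kind_indicator)  # multilabel-indicator
print(kind_three)      # multiclass-multioutput
```

If the inputs that raised the error report `multiclass-multioutput` here, they likely contain something other than a pure 0/1 indicator matrix.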

I converted the data into NumPy arrays and used f1_score again:

print(f1_score(np.array(true_labels),np.array(predicted_labels), average='micro'))

Then I get this score:

0.8275862068965517
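
This value can be reproduced by hand, which helps show what micro averaging does on an indicator matrix: true positives, false positives and false negatives are pooled over every sample and every label before a single F1 is computed. A minimal sketch using only NumPy follows; the variable names are illustrative.

```python
import numpy as np

y_pred = np.array([[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,1,0,1]])
y_true = np.array([[1,1,0,0,1],[1,0,0,1,1],[1,0,0,0,1],[1,1,1,0,1],[1,0,0,0,1],[1,0,0,0,1]])

# Pool the counts over all 30 label slots.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # predicted 1, truly 1
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # predicted 1, truly 0
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # predicted 0, truly 1

# F1 = 2*TP / (2*TP + FP + FN)
micro_f1 = 2 * tp / (2 * tp + fp + fn)

print(tp, fp, fn)  # 12 1 4
print(micro_f1)    # 24/29, about 0.8276
```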

I tried one more experiment: I took each pair of true and predicted labels one at a time, computed the F1 score on that pair, and then took the mean over all pairs:

per_sample_scores = []  # one micro-averaged F1 value per sample

for tru, pred in zip(true_labels, predicted_labels):
    per_sample_scores.append(f1_score(tru, pred, average='micro'))

print(np.mean(per_sample_scores))

Output:

0.8333333333333335

The two scores are different.
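
A quick check suggests where the second number comes from. When a single row is passed to `f1_score`, scikit-learn sees a 1-D vector of 0s and 1s and treats it as five separate binary predictions, with 0 and 1 both counted as classes. Micro-averaged F1 over all classes of a single-label problem equals plain accuracy, so the loop appears to compute per-row accuracy, and its mean equals 1 minus the Hamming loss. The sketch below tests that reading on the demo data; the variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, hamming_loss

predicted_labels = [[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,1,0,1]]
true_labels      = [[1,1,0,0,1],[1,0,0,1,1],[1,0,0,0,1],[1,1,1,0,1],[1,0,0,0,1],[1,0,0,0,1]]

# What the loop in the question computes for each row.
row_f1 = [f1_score(t, p, average='micro') for t, p in zip(true_labels, predicted_labels)]

# Plain accuracy over the five label slots of each row.
row_acc = [accuracy_score(t, p) for t, p in zip(true_labels, predicted_labels)]

mean_row_f1 = np.mean(row_f1)
one_minus_hl = 1 - hamming_loss(np.array(true_labels), np.array(predicted_labels))

print(np.allclose(row_f1, row_acc))  # True
print(mean_row_f1, one_minus_hl)     # both 25/30, about 0.8333
```

Under this reading, the array call yields the multilabel micro F1 (correct 0s are not rewarded), while the loop yields a label-wise accuracy that also counts correctly predicted 0s.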

Why does it fail on a list of lists but work on a NumPy array, and which method is correct: scoring each sample separately and taking the mean, or passing all samples at once as a NumPy array?

What other metrics are available for evaluating multilabel classification?


Answer 1:


You can check this answer and the other answers where this has already been discussed.
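
As a rough illustration of other metrics scikit-learn offers for multilabel indicator arrays, the sketch below computes sample-averaged precision, recall and F1, plus a per-label F1. It is not taken from the linked answers. It assumes scikit-learn 0.22 or later for the `zero_division` argument, which is used here to score labels that are never predicted as 0 without emitting a warning.

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

y_pred = np.array([[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,0,0,1],[1,0,1,0,1]])
y_true = np.array([[1,1,0,0,1],[1,0,0,1,1],[1,0,0,0,1],[1,1,1,0,1],[1,0,0,0,1],[1,0,0,0,1]])

# average='samples': compute the metric per row, then average over rows.
p_samples = precision_score(y_true, y_pred, average='samples')
r_samples = recall_score(y_true, y_pred, average='samples')
f_samples = f1_score(y_true, y_pred, average='samples')

# average=None: one F1 value per label column.
f_per_label = f1_score(y_true, y_pred, average=None, zero_division=0)

print(p_samples, r_samples, f_samples)
print(f_per_label)  # labels 0 and 4 score 1.0, labels 1 to 3 score 0.0
```

The per-label view shows something the pooled scores hide: on this data, labels 1, 2 and 3 are never predicted correctly.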



Source: https://stackoverflow.com/questions/59195168/multilabel-multiclass-accuracy-how-to-calculate-accuracy-for-multiclass-mult
