What does recall mean in Machine Learning?

一向 2021-01-31 10:42

I know what recall means for a search engine, but what does recall mean for a classifier, e.g. a Bayes classifier? Please give an example, thanks.

for examp

6 answers
  • 2021-01-31 11:15

    These terms actually come from signal detection theory. For details, see http://en.wikipedia.org/wiki/Receiver_operating_characteristic

    Look on the right of that page, under "Terminology and derivations from a confusion matrix".

  • 2021-01-31 11:27

    Recall is literally how many of the actual positives were recalled (found), i.e. how many of the correct hits were also found.

    Precision (your formula is incorrect) is how many of the returned hits were true positives, i.e. how many of the found items were correct hits.

    It's pretty straightforward, actually.

  • 2021-01-31 11:27

    I found the explanation of Precision and Recall from Wikipedia very useful:

    Suppose a computer program for recognizing dogs in photographs identifies 8 dogs in a picture containing 12 dogs and some cats. Of the 8 dogs identified, 5 actually are dogs (true positives), while the rest are cats (false positives). The program's precision is 5/8 while its recall is 5/12. When a search engine returns 30 pages only 20 of which were relevant while failing to return 40 additional relevant pages, its precision is 20/30 = 2/3 while its recall is 20/60 = 1/3.

    So, in this case, precision is "how useful the search results are", and recall is "how complete the results are".
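    The arithmetic in the two Wikipedia examples can be checked with a short sketch. The function and variable names below are illustrative assumptions, not part of the quoted text; exact fractions are used so the results match the quoted values.

```python
from fractions import Fraction


def precision(true_positives: int, false_positives: int) -> Fraction:
    """Share of the returned items that are correct."""
    return Fraction(true_positives, true_positives + false_positives)


def recall(true_positives: int, false_negatives: int) -> Fraction:
    """Share of the relevant items that were actually returned."""
    return Fraction(true_positives, true_positives + false_negatives)


# Dog example: 8 animals identified as dogs, 5 of them really are dogs,
# and the picture contains 12 dogs in total.
dog_tp = 5
dog_fp = 8 - dog_tp    # cats identified as dogs
dog_fn = 12 - dog_tp   # dogs the program missed
dog_precision = precision(dog_tp, dog_fp)
dog_recall = recall(dog_tp, dog_fn)

# Search example: 30 pages returned, 20 of them relevant,
# 40 further relevant pages not returned.
search_precision = precision(20, 30 - 20)
search_recall = recall(20, 40)

print(dog_precision, dog_recall)        # 5/8 5/12
print(search_precision, search_recall)  # 2/3 1/3
```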

  • 2021-01-31 11:27

    Precision in ML is the same as in Information Retrieval.

    recall = TP / (TP + FN)
    precision = TP / (TP + FP)
    

    (Where TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative).

    These notations make sense for a binary classifier, where the "positive" class is usually the less common one. Note that the precision/recall metrics are actually the specific form, for #classes=2, of the more general confusion matrix.

    Also, what you call "precision" is actually accuracy, which is (TP+TN) / ALL.
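    To make the difference between the three quantities concrete, here is a minimal sketch that counts TP, FP, FN and TN for a binary classifier and applies the formulas above. The helper `confusion_counts` and the label lists are made up for illustration (1 is taken as the rarer "positive" class); they do not come from the question.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for one class treated as 'positive'."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical labels: the positive class (1) is rare.
y_true = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

recall = tp / (tp + fn)
precision = tp / (tp + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(tp, fp, fn, tn)               # 1 1 1 7
print(recall, precision, accuracy)  # 0.5 0.5 0.8
```

    With a rare positive class, accuracy is dominated by the true negatives, which is why it can look good while precision and recall stay low.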

  • 2021-01-31 11:27

    Here is an example. Imagine we have a machine learning model that distinguishes cats from dogs. The actual label, provided by a human, is called the ground truth. The output of your model is called the prediction. Now look at the following table:

    ExampleNo        Ground-truth        Model's Prediction
       0                 Cat                   Cat
       1                 Cat                   Dog
       2                 Cat                   Cat
       3                 Dog                   Cat
       4                 Dog                   Dog
    

    Say we want to find the recall for the class cat. By definition, recall is the percentage of a certain class that is correctly identified (out of all the given examples of that class). For the class cat, the model identified it correctly 2 times (in examples 0 and 2). But does that mean there are only 2 cats? No! There are 3 cats in the ground truth (human labeled). So what is the percentage of correct identifications for this class? 2 out of 3, that is (2/3) * 100 = 66.67%, or 0.667 on a 0-to-1 scale. There is another prediction of cat in example 3, but it is not a correct prediction, so we do not count it.
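    The same count can be reproduced directly from the table. This is only a sketch; the list and variable names are assumptions chosen for the illustration.

```python
# The five rows of the table, in order of ExampleNo.
ground_truth = ["Cat", "Cat", "Cat", "Dog", "Dog"]
predictions = ["Cat", "Dog", "Cat", "Cat", "Dog"]

# All cats according to the human labels.
cats_total = ground_truth.count("Cat")

# Cats that the model also called "Cat" (examples 0 and 2).
cats_found = sum(
    1 for truth, pred in zip(ground_truth, predictions)
    if truth == "Cat" and pred == "Cat"
)

cat_recall = cats_found / cats_total
print(cats_found, cats_total)  # 2 3
print(f"{cat_recall:.3f}")     # 0.667
```

    Example 3 (a dog predicted as cat) never enters the count, because recall only looks at rows whose ground truth is Cat.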

    Now for the mathematical formulation. First, understand two terms:

    TP (true positive): predicting positive when the example is actually positive. If cat is our positive class, this means predicting cat when it is actually a cat.

    FN (false negative): predicting negative when the example is actually positive. If cat is our positive class, this means predicting dog (not cat) when it is actually a cat.

    Now, for a certain class, this classifier's output can be one of two types: Cat or Dog (Not Cat). The number of correct identifications is the number of true positives (TP). The total number of examples of that class in the ground truth is TP + FN, because out of all cats the model either detected them correctly (TP) or failed to detect them (FN, i.e. the model falsely said negative (Not Cat) when the example was actually positive (Cat)). So for a certain class, TP + FN is the total number of examples of that class in the ground truth, and the formula is:

    Recall = TP / (TP + FN)
    

    Recall can be calculated for Dog in the same way. In that case, treat Dog as the positive class and Cat as the negative class.

    So, for any number of classes, to find the recall of a certain class, take that class as the positive class, take all the other classes together as the negative class, and apply the formula. Repeat the process for each class to find the recall for all of them.
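    The one-class-versus-the-rest procedure can be sketched as a small function. The name `recall_per_class` is hypothetical, and the data is the cat/dog table from this answer; the same loop works unchanged for more than two classes.

```python
def recall_per_class(ground_truth, predictions):
    """Recall = TP / (TP + FN), treating each class in turn as positive."""
    recalls = {}
    for positive in sorted(set(ground_truth)):
        # TP: the example is the positive class and was predicted as such.
        tp = sum(
            1 for truth, pred in zip(ground_truth, predictions)
            if truth == positive and pred == positive
        )
        # FN: the example is the positive class but was predicted as another class.
        fn = sum(
            1 for truth, pred in zip(ground_truth, predictions)
            if truth == positive and pred != positive
        )
        recalls[positive] = tp / (tp + fn)
    return recalls


ground_truth = ["Cat", "Cat", "Cat", "Dog", "Dog"]
predictions = ["Cat", "Dog", "Cat", "Cat", "Dog"]

recalls = recall_per_class(ground_truth, predictions)
for label, value in recalls.items():
    print(f"{label}: {value:.3f}")
# Cat: 0.667
# Dog: 0.500
```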

    If you want to learn about precision as well then go here: https://stackoverflow.com/a/63121274/6907424

  • 2021-01-31 11:41

    In very simple language: for example, in a series of photos showing politicians, how many times was a photo of politician XY (say, A. Merkel) correctly recognised as showing that politician and not some other one?

    • precision goes down with the number of times ANOTHER person was recognised as XY (false positives): (correct hits) / ((correct hits) + (false positives))

    • recall goes down with the number of times the person shown in the photo was not recognised ('recalled') as XY (misses, i.e. false negatives): (correct hits) / ((correct hits) + (misses))
