Classification is poor although term frequency is right

别跟我提以往 2021-01-26 15:54

Using the function below, I check which words are most frequent in each category and then look at how some sentences are classified. The results are surprisingly wrong.

1 answer
  • 2021-01-26 16:18

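    The order of names in the cat variable differs from the order in newsgroups_train.target_names. The names in target_names are sorted alphabetically, and the integer label the classifier predicts is an index into target_names, not into your own list.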

    Output of: print(cat)

    ['sci.space','rec.autos','rec.motorcycles']

    print(newsgroups_train.target_names)

    ['rec.autos', 'rec.motorcycles', 'sci.space']

    You should change this line:

    print(" - Predicted as: '{}'".format(cats[predicted]))

    to

    print(" - Predicted as: '{}'".format(newsgroups_train.target_names[predicted]))
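
    To see why the lookup list matters, here is a minimal sketch that needs no dataset download. It assumes target_names is simply the alphabetically sorted category list, as in the output above; the helper label_for and the value of predicted are hypothetical and only illustrate the index mismatch.

```python
# Hypothetical stand-ins for the variables discussed above.
cats = ['sci.space', 'rec.autos', 'rec.motorcycles']  # order chosen by the user

# Assumption: target_names holds the same names, sorted alphabetically.
target_names = sorted(cats)


def label_for(predicted, names):
    """Map an integer class index to a category name (illustrative helper)."""
    return names[predicted]


predicted = 2  # example class index a classifier might return

wrong = label_for(predicted, cats)          # indexes the user's ordering
right = label_for(predicted, target_names)  # indexes the sorted ordering

print(" - Looked up in cats:         '{}'".format(wrong))
print(" - Looked up in target_names: '{}'".format(right))
```

    The same index yields two different names, so a correct prediction is printed with the wrong label whenever the two orderings disagree.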
