Similarity function for Mahout boolean user-based recommender

孤人 提交于 2019-12-06 06:43:30

(I'm the developer.) If I was stranded on a desert island with just one similarity metric for data without ratings/prefs, it would be log-likelihood. I would generally expect it to be the better similarity metric.

The problem with the test you're doing is that, perhaps not at all obviously, it's not meaningful for this kind of recommender / data. RMSE is root-mean-square-error, and it's comparing the actual vs predicted rating for held-out test data. But you have no ratings. They're all "1.0". Really, RMSE is always 0!

It's not, since to have anything to rank on, these recommenders will rank by some meaningful function of the similarities. But they are not estimating ratings / prefs at all. So, RMSE means squat here.

The only metric you can really use is a precision/recall test in this case, I think. Even that is problematic. This and more fun topics are covered in a book which I will shamelessly promote: Mahout in Action

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!