问题
I'm exploring some of the code examples in Mahout in Action in more detail. I have built a small test that computes the RMS of various algorithms applied to my data.
Of course, multiple parameters impact the RMS, but I don't understand the "unable to recommend in ... cases" message that is generated while running an evaluation.
Looking at StatsCallable.java, this is generated when an evaluator encounters a NaN response; Perhaps not enough data in the training set or the user's prefs to provide a recommendation.
It seems like the RMS score isn't impacted by a very large set of "unable to recommend" cases. Is that assumption correct? Should I be evaluating my algorithm not only on RMS but also the ratio of "unable to recommend" cases versus my overall training set?
I'd appreciate any feedback.
回答1:
Yes this essentially means there was no data at all on which to base an estimate. That's generally a symptom of data sparseness. It should be rare, and happen only for users with data that's very small or disconnected from others'.
I personally think it's not such a big deal unless it's a really significant percentage (20%+?) I'd worry more if you couldn't generate any recs at all for many users.
来源:https://stackoverflow.com/questions/10355031/evaluating-recommenders-unable-to-recommend-in-x-cases