问题
I'm using ALS algorithm (implicitPrefs = True)in Spark (Recommendation system algorithm). Normally, after run this algorithm, value predict must be from 0 to 1. But i received value greater than 1
"usn" : 72164,
"recommendations" : [
{
"item_code" : "C1346",
"rating" : 0.756096363067627
},
{
"item_code" : "C0117",
"rating" : 0.966064214706421
},
{
"item_code" : "I0009",
"rating" : 1.00000607967377
},
{
"item_code" : "C0102",
"rating" : 0.974934458732605
},
{
"item_code" : "I0853",
"rating" : 1.03272235393524
},
{
"item_code" : "C0103",
"rating" : 0.928574025630951
}
]
I don't understand why or what it is have rating value greater than 1 ("rating" : 1.00000607967377 and "rating" : 1.03272235393524)
Some question similar but i still don't understand: MLLib spark -ALStrainImplicit value more than 1
Anybody help me explain abnormal value
回答1:
Don't worry about that ! There is nothing wrong with ALS
.
Nevertheless, the prediction scores returned by ALS with implicit feedbacks with Apache Spark aren't normalized to fit be between [0,1], like you saw. You might even get negative values sometimes. (more on that here.)
ALS
uses stochastic gradient descent and approximations to compute (and re-compute) users and item factors on each step to minimize the cost function which allows it to scale.
As a matter of fact, normalizing those scores isn't relevant. The reason for this is actually that those scores doesn't mean much on their own.
You can't use RMSE
per example on those scores to evaluate the performance of your recommendations. If you are interested in evaluating this type of recommenders, I advice you to read my answer on How can I evaluate the implicit feedback ALS algorithm for recommendations in Apache Spark?
There is many techniques used in research or/and the industry to deal with such types of results. e.g You can binarize predictions per say using a threshold
.
来源:https://stackoverflow.com/questions/46904078/spark-als-recommendation-system-have-value-prediction-greater-than-1