I'm trying to tune the parameters of an ALS matrix factorization model that uses implicit data. For this, I'm trying to use pyspark.ml.tuning.CrossValidator to run through
Very late to the party here, but I'll post in case anyone stumbles upon this question like I did.
I was getting a similar error when trying to use CrossValidator
with an ALS model. I resolved it by setting the coldStartStrategy parameter in ALS
to "drop". That is:
alsImplicit = ALS(implicitPrefs=True, coldStartStrategy="drop")
and keep the rest of the code the same.
I suspect what was happening in my case is that the cross-validation splits put users or items in the validation fold that never appeared in the training fold. ALS has no learned factors for those, so it predicts NaN for them, and a single NaN prediction is enough to make the evaluation metric NaN as well. Setting coldStartStrategy="drop" removes the rows with NaN predictions before evaluation, which is the approach described in the Spark documentation.
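To illustrate why NaN predictions are a problem for the evaluator, here is a small plain-Python sketch (no Spark involved). The rmse helper and the numbers are made up for illustration; the point is only that one NaN prediction poisons the whole metric, while dropping it gives a usable value:

```python
import math

def rmse(pairs):
    """Root-mean-square error over (label, prediction) pairs."""
    return math.sqrt(sum((y - p) ** 2 for y, p in pairs) / len(pairs))

# Hypothetical validation fold: the last row stands for an item that was
# not in the training fold, so the model's prediction for it is NaN.
fold = [(1.0, 0.8), (0.0, 0.1), (1.0, float("nan"))]

# NaN propagates through the sum, so the metric itself becomes NaN.
with_nan = rmse(fold)

# Dropping the NaN rows first (what coldStartStrategy="drop" does to the
# prediction DataFrame) leaves a metric computed over the remaining rows.
kept = [(y, p) for y, p in fold if not math.isnan(p)]
without_nan = rmse(kept)

print(with_nan, without_nan)
```

A NaN metric cannot be meaningfully compared across parameter combinations, which is why model selection falls apart without the "drop" setting.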
I don't know if we were getting the same error, so I can't guarantee this will solve OP's problem, but setting coldStartStrategy="drop" is good practice for cross-validation anyway.
Note: my error message was "Params must be either a param map or a list/tuple of param maps", which didn't seem to point to the coldStartStrategy parameter or to NaN values, but this change resolved the error.