StackOverflow-error when applying pyspark ALS's “recommendProductsForUsers” (although cluster of >300GB Ram available)
问题 Looking for expertise to guide me on issue below. Background: I'm trying to get going with a basic PySpark script inspired on this example As deploy infrastructure I use a Google Cloud Dataproc Cluster. Cornerstone in my code is the function "recommendProductsForUsers" documented here which gives me back the top X products for all users in the model Issue I incur The ALS.Train script runs smoothly and scales well on GCP (Easily >1mn customers). However, applying the predictions: i.e. using