I am working on a classification model in Spark. Customer ID is the primary key of the dataset, and in the end I need to produce a prediction for each customer ID. My questions: