Choosing an sklearn pipeline for classifying user text data
问题 I'm working on a machine learning application in Python (using the sklearn module), and am currently trying to decide on a model for performing inference. A brief description of the problem: Given many instances of user data, I'm trying to classify them into various categories based on relative keyword containment. It is supervised, so I have many, many instances of pre-classified data that are already categorized. (Each piece of data is between 2 and 12 or so words.) I am currently trying to