Classify data using Apache Mahout

悲&欢浪女 2021-02-04 11:07

I am trying to solve a simple classification problem.

The Problem:
I have a set of text and I have to categorize them based on the content.

Solution usin

2 Answers
  •  被撕碎了的回忆
    2021-02-04 11:45

    I am having a similar problem.

    Running

    bin/mahout org.apache.mahout.classifier.Classify --path  --classify  --encoding UTF-8 --analyzer org.apache.mahout.vectorizer.DefaultAnalyzer --defaultCat unknown --gramSize 1 --classifierType bayes --dataSource hdfs
    

    will classify a text file based on the model.

    This might get you a bit further forward, but I'm guessing that, like me, you want to classify a whole load of documents and you want the output in a useful format.

    You might have to write a bit of Java to do this. Someone has an example that looks like it will do what I want at https://bitbucket.org/jaganadhg/blog/src/tip/bck9/java/src/org/bc/kl/ClassifierDemo.java
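
    If writing Java is more than you need, a rough alternative is to loop over the documents from the shell and call the same command once per file. The sketch below is untested against a real Mahout install and makes several assumptions: MAHOUT, MODEL_DIR, DOCS_DIR and OUT_FILE are hypothetical names, --path is assumed to take the model location and --classify the document to classify, and the label is assumed to be on the last line of the command's output. Check what the command actually prints before relying on it.

```shell
#!/bin/sh
# Sketch only: run the Classify command once per document and collect
# "document<TAB>last output line" in a single file.
# All four settings below are hypothetical placeholders; adjust them.
MAHOUT="${MAHOUT:-bin/mahout}"
MODEL_DIR="${MODEL_DIR:-model}"
DOCS_DIR="${DOCS_DIR:-docs}"
OUT_FILE="${OUT_FILE:-results.tsv}"

classify_one() {
  # $1 = path of one text document (assumed to be what --classify expects)
  "$MAHOUT" org.apache.mahout.classifier.Classify \
    --path "$MODEL_DIR" --classify "$1" \
    --encoding UTF-8 \
    --analyzer org.apache.mahout.vectorizer.DefaultAnalyzer \
    --defaultCat unknown --gramSize 1 \
    --classifierType bayes --dataSource hdfs
}

classify_all() {
  : > "$OUT_FILE"                      # start with an empty result file
  for doc in "$DOCS_DIR"/*.txt; do
    [ -e "$doc" ] || continue          # skip if the glob matched nothing
    # Assumption: the predicted label is on the last line of the output.
    result="$(classify_one "$doc" 2>/dev/null | tail -n 1)"
    printf '%s\t%s\n' "$doc" "$result" >> "$OUT_FILE"
  done
}
```

    Calling classify_all after setting the four variables writes one line per document to OUT_FILE. Each call starts a new JVM, so for a large collection the Java route above is likely to be much faster.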
