Running Mahout from the command line (CLASSPATH)

Posted by 社会主义新天地 on 2019-12-07 16:10:29

Question


I compiled Mahout successfully under Windows using Maven.

I'm trying to run one of the examples from the command line and I can't see what I'm doing wrong. It seems like a CLASSPATH problem.

Let's say I want to run the GroupLensRecommenderEvaluatorRunner example. I go to the folder with the GroupLensRecommenderEvaluatorRunner.class file in it and execute:

java -cp C:/mahout/core/target/classes;. org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner

It gives me a NoClassDefFoundError for the GroupLensRecommenderEvaluatorRunner class.

Is the path for -cp wrong?

By the way, for those who aren't familiar with Mahout,

org.apache.mahout.cf.taste.example.grouplens

is the package of the GroupLensRecommenderEvaluatorRunner class (see its javadoc).

Thanks, guys.

P.S. Before asking this question, I looked at previous Stack Overflow questions on CLASSPATH and tried the solutions given there.


Answer 1:


This is better asked at user@mahout.apache.org.

Your classpath is missing compiled code in Mahout's examples module, which is where this class lives.

Better yet, have a look at this walkthrough: https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
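As a rough sketch of what that means for the command in the question: the classpath needs the examples module's class directory in addition to core's. The snippet below only assembles and prints such a command. It assumes the examples classes are built to examples/target/classes under the same checkout, and it uses the Windows ';' separator from the question (use ':' on Linux or macOS). Mahout's own third-party dependencies are not covered here, so further NoClassDefFoundError messages for other classes may still appear until those jars are added as well.

```shell
# Assumed checkout location; the question used C:/mahout on Windows.
MAHOUT_HOME="${MAHOUT_HOME:-C:/mahout}"

# Classpath entry separator: ';' on Windows (as in the question), ':' on Linux/macOS.
SEP=';'

# Class directories of the core module and the examples module.
CP="$MAHOUT_HOME/core/target/classes${SEP}$MAHOUT_HOME/examples/target/classes"

# Fully qualified name of the class to run (from the question).
MAIN="org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner"

# Print the command instead of running it, so the sketch is safe to try anywhere.
printf 'java -cp "%s" %s\n' "$CP" "$MAIN"
```

Because the class is named by its full package, the command can be run from any directory; the "." entry in the original classpath is not needed.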




Answer 2:


If you put $MAHOUT_HOME/examples/target/classes on the Java CLASSPATH (as Sean mentions), this will work when running locally, but you'll probably have to try the method below for a Hadoop cluster deployment.

I found the following post very illuminating about how to get the right classes in various configurations of Mahout/Hadoop.

http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/

The mahout script does not accept Hadoop job parameters (like -libjars) in all cases, although I hope it will in the future, especially where a parameter to the job is a class name (seq2sparse, for instance).

What I had to do was copy my custom jar into $HADOOP_HOME/lib on the master node. Evidently a symlink does not work; it appears you have to copy each jar you want into that directory.

Then don't forget to stop and start Hadoop because, as the Cloudera reference says, it packages the libs at startup.
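The copy step could be sketched roughly as follows. install_jar is a hypothetical helper written for illustration, not part of Hadoop or Mahout; it uses cp -L so that a real file lands in the lib directory even if the source path is itself a symlink. The restart commands in the comments assume a Hadoop 1.x-style install with stop-all.sh and start-all.sh under $HADOOP_HOME/bin, which may differ on your cluster.

```shell
# Hypothetical helper: copy a jar into a Hadoop lib directory as a regular file.
install_jar() {
  jar="$1"
  lib_dir="$2"
  # Refuse to continue if the jar or the target directory is missing.
  [ -f "$jar" ] || { echo "no such jar: $jar" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "no such directory: $lib_dir" >&2; return 1; }
  # -L follows symlinks, so the destination is a real copy and not a link.
  cp -L "$jar" "$lib_dir/"
}

# Example usage on the master node (paths are placeholders):
#   install_jar /path/to/my-custom.jar "$HADOOP_HOME/lib"
#   "$HADOOP_HOME/bin/stop-all.sh" && "$HADOOP_HOME/bin/start-all.sh"
```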




Answer 3:


What I did was set HADOOP_CLASSPATH to include my jar and all the Mahout jar files, as shown below.

export HADOOP_CLASSPATH=/home/xxx/my.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-integration-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-math-0.7-cdh4.3.0.jar

Then I was able to run:

hadoop com.mycompany.mahout.CSVtoVector iris/nb/iris1.csv iris/nb/data/iris.seq

So you have to include all your jars and the Mahout jars in HADOOP_CLASSPATH, and then you can just run your class with:

hadoop <classname>
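Listing every jar by hand is error-prone, so one possible shortcut is to build the list from the directory contents. The sketch below is only an illustration: build_classpath is a hypothetical helper, and it picks up every *.jar in the directory, which may be more than the six Mahout jars listed in the answer above.

```shell
# Hypothetical helper: join every *.jar in a directory into a ':'-separated list.
build_classpath() {
  dir="$1"
  cp_list=""
  for jar in "$dir"/*.jar; do
    # When no jar matches, the pattern stays unexpanded; skip it.
    [ -e "$jar" ] || continue
    # Prepend the separator only when the list is already non-empty.
    cp_list="${cp_list:+$cp_list:}$jar"
  done
  printf '%s\n' "$cp_list"
}

# Example usage with the paths from the answer above:
#   MAHOUT_LIB=/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout
#   export HADOOP_CLASSPATH="/home/xxx/my.jar:$(build_classpath "$MAHOUT_LIB")"
#   hadoop com.mycompany.mahout.CSVtoVector iris/nb/iris1.csv iris/nb/data/iris.seq
```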



Source: https://stackoverflow.com/questions/3571486/running-mahout-from-the-command-line-classpath
