Running Mahout from the command line (CLASSPATH)

Posted by 假装没事ソ on 2019-12-06 03:40:45

This is better asked at

Your classpath is missing compiled code in Mahout's examples module, which is where this class lives.

Better yet, have a look at this walkthrough:


If you put $MAHOUT_HOME/examples/target/classes on the Java CLASSPATH (as Sean mentions), this will work when running locally, but for a Hadoop cluster deployment you'll probably have to try the method below.
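For the local case, the classpath change might look roughly like this. It is only a sketch: it assumes MAHOUT_HOME points at a Mahout source checkout whose examples module has already been built (for example with Maven), and the /opt/mahout fallback is a placeholder, not a real default.

```shell
# Assumption: MAHOUT_HOME points at a Mahout source tree; /opt/mahout is only a placeholder.
MAHOUT_HOME="${MAHOUT_HOME:-/opt/mahout}"
EXAMPLES_CLASSES="$MAHOUT_HOME/examples/target/classes"

# Prepend the compiled examples classes, keeping whatever was already on CLASSPATH.
export CLASSPATH="$EXAMPLES_CLASSES${CLASSPATH:+:$CLASSPATH}"

# Warn early if the directory is missing (the examples module has not been built yet).
[ -d "$EXAMPLES_CLASSES" ] || echo "warning: $EXAMPLES_CLASSES not found; build the examples module first" >&2
```

The `${CLASSPATH:+:$CLASSPATH}` expansion avoids a trailing colon when CLASSPATH was empty, since an empty classpath entry is treated as the current directory.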

I found the following post very illuminating about how to get the right classes onto the classpath in various Mahout/Hadoop configurations.

The mahout script does not accept Hadoop job parameters (such as -libjars) in all cases, although I hope it will in the future, especially where a parameter to the job is a class name (seq2sparse, for instance).

What I had to do was copy my custom jar into $HADOOP_HOME/lib on the master node. Evidently a symlink does not work; it appears you have to copy each jar you want into that directory.

Then don't forget to stop and start Hadoop because, as the Cloudera reference says, it packages the libs at startup.
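The copy step could be wrapped in a small helper like the one below. This is a sketch, not part of Mahout or Hadoop: install_jar is a made-up name, and the example paths are placeholders. It copies the jar as a regular file even when the source is a symlink, since symlinks reportedly do not work here.

```shell
# Copy a jar into a lib directory as a real file (never as a symlink).
# Usage: install_jar <jar> <lib_dir>    (hypothetical helper name)
install_jar() {
  jar="$1"
  lib_dir="$2"
  [ -f "$jar" ] || { echo "no such jar: $jar" >&2; return 1; }
  [ -d "$lib_dir" ] || { echo "no such directory: $lib_dir" >&2; return 1; }
  # -L follows symlinks, so the destination is a regular copy of the jar's contents.
  cp -L "$jar" "$lib_dir/"
}

# Example (placeholder paths), run on the master node:
#   install_jar /home/xxx/my.jar "$HADOOP_HOME/lib"
# Afterwards, stop and start Hadoop with whatever mechanism your installation provides.
```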

What I did was set HADOOP_CLASSPATH to include my jar and all the Mahout jar files, as shown below.

export HADOOP_CLASSPATH=/home/xxx/my.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-integration-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-math-0.7-cdh4.3.0.jar

Then I was able to run hadoop com.mycompany.mahout.CSVtoVector iris/nb/iris1.csv iris/nb/data/iris.seq

So you have to include all your own jars and the Mahout jars in HADOOP_CLASSPATH, and then you can just run your class with

hadoop <classname>
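If typing out every jar by hand gets tedious, a helper along these lines can assemble the colon-separated list from a directory. This is only a sketch: jars_in is a made-up name, /path/to/mahout/lib is a placeholder, and it picks up every *.jar in the directory, which may be more than you actually need.

```shell
# Join every *.jar in a directory into a colon-separated classpath string.
# Usage: jars_in <dir>    (hypothetical helper name)
jars_in() {
  cp_out=""
  for jar in "$1"/*.jar; do
    [ -e "$jar" ] || continue            # skip the unexpanded pattern when nothing matches
    cp_out="${cp_out:+$cp_out:}$jar"     # add a ':' separator only between entries
  done
  printf '%s\n' "$cp_out"
}

# Example (placeholder paths; adjust to your installation):
#   export HADOOP_CLASSPATH="/home/xxx/my.jar:$(jars_in /path/to/mahout/lib)"
#   hadoop com.mycompany.mahout.CSVtoVector iris/nb/iris1.csv iris/nb/data/iris.seq
```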
