mahout

Hadoop Mahout Clustering

╄→гoц情女王★ submitted on 2019-12-24 17:25:35
Question: I am trying to apply canopy clustering in Mahout. I have already converted a text file into a sequence file, but I cannot view the sequence file. Anyway, I decided to apply canopy clustering with the following command: hduser@ubuntu:/usr/local/mahout/trunk$ mahout canopy -i /user/Hadoop/mahout_seq/seqdata -o /user/Hadoop/clustered_data -t1 5 -t2 3 I got the following error: 16/05/10 17:02:03 INFO mapreduce.Job: Task Id : attempt_1462850486830_0008_m_000000_1, Status : FAILED Error: java.lang
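For readers unsure what the -t1 and -t2 values in the command above control, here is a minimal sketch of the textbook canopy procedure in Python. It is not Mahout's implementation and does not address the error in the question; the function names and the sample points are made up for illustration. T1 is the loose distance threshold and T2 the tight one, with T1 > T2.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_clusters(points, t1, t2, distance=euclidean):
    """Group points into (possibly overlapping) canopies.

    Illustrative only: t1 is the loose threshold, t2 the tight one.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []
    while remaining:
        # Take the next candidate as the center of a new canopy.
        center = remaining.pop(0)
        members = [center]
        still_candidates = []
        for p in remaining:
            d = distance(center, p)
            if d < t1:
                # Within the loose threshold: belongs to this canopy.
                members.append(p)
            if d >= t2:
                # Outside the tight threshold: may still seed or join others.
                still_candidates.append(p)
        remaining = still_candidates
        canopies.append((center, members))
    return canopies


# Hypothetical sample data, using the same thresholds as the command (5 and 3).
points = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
canopies = canopy_clusters(points, t1=5.0, t2=3.0)
for center, members in canopies:
    print(center, members)
```

In this toy run the point (4.0, 0.0) lies within T1 of the first center but outside T2, so it joins the first canopy and also becomes the center of its own, which is how canopies come to overlap.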

Mahout IntDoubleProcedure NoClassDefFoundError

ε祈祈猫儿з submitted on 2019-12-24 06:43:29
Question: I'm using my school's server, which already has Hadoop and Mahout installed, but I need to parse CSV into vectors. I tried someone else's code from Git and got the following exception, which I can't resolve: dcmac04:dir username$ java -jar BigDataNaiveBayes_fat.jar May 30, 2015 1:48:17 AM org.apache.hadoop.util.NativeCodeLoader <clinit> WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable May 30, 2015 1:48:17 AM org.apache.hadoop.io.compress

Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/mahout/math/Vector

北城以北 submitted on 2019-12-24 05:25:30
Question: I wrote Java code to convert a CSV file to vectors for use in a classification task with the random forest algorithm. I use Mahout 0.10.0, Hadoop 2.6.0, and Eclipse. I then tried to run this code from cmd with this command: hadoop jar /path to my jar/CSVToVector.jar com.classification.csvtovector.CSVToVector But I got this error: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/mahout/math/Vector at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348)

Running Mahout Locally getting ClassNotFoundException for MahoutDriver

北城余情 submitted on 2019-12-23 12:38:39
Question: I am trying to run Mahout locally (without Hadoop) on a Windows 8 machine. I realize this is not the optimal setup, but it's what I've got to work with. When I try to run bin/mahout I get the following error: $ bin/mahout MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath. no HADOOP_HOME set, running locally Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/mahout/driver/MahoutDriver Caused by: java.lang.ClassNotFoundException: org.apache.mahout.driver

Datasets for Apache Mahout

旧城冷巷雨未停 submitted on 2019-12-22 08:56:20
Question: I am looking for datasets that can be used to implement the recommendation system use case of Apache Mahout. I only know of the MovieLens data sets from the GroupLens Research group. Does anyone know of any other datasets that can be used for implementing a recommendation system? I am particularly interested in item-based data sets, though other datasets are most welcome. Answer 1: This is Sebastian from Mahout. There is a dataset from a Czech dating website available that might be of interest to you: http://www

How can I compile/use Mahout for Hadoop 2.0?

痞子三分冷 submitted on 2019-12-22 08:51:38
Question: The latest release, Mahout 0.9, is only built against Hadoop 1.x (mvn clean install). How can I compile Mahout for Hadoop 2.0.x? When I run the command hadoop jar mahout-examples-0.9-SNAPSHOT-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -s SIMILARITY_COOCCURENCE -i test -o result I always get the error message IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected. Thanks! Answer 1: To compile Mahout to work with 2.x

ClassNotFoundException org.apache.mahout.math.VectorWritable

*爱你&永不变心* submitted on 2019-12-22 00:25:17
Question: I'm trying to turn a CSV file into sequence files so that I can train and run a classifier across the data. I have a Java job file that I compile and then add to the Mahout job jar. When I try to run my job from the Mahout jar with hadoop jar, I get a java.lang.ClassNotFoundException: org.apache.mahout.math.VectorWritable. I'm not sure why, because when I look in the Mahout jar, that class is indeed present. Here are the steps I'm following: #get new copy of mahout jar rm iris.jar cp /home

Mahout runs out of heap space

久未见 submitted on 2019-12-21 16:50:25
Question: I am running NaiveBayes on a set of tweets using Mahout. There are two files, one 100 MB and one 300 MB. I changed JAVA_HEAP_MAX to JAVA_HEAP_MAX=-Xmx2000m (earlier it was 1000). Even then, Mahout ran for two hours before it failed with a heap space error. What should I do to resolve this? Some more info, in case it helps: I am running on a single node, my laptop in fact, which has only 3 GB of RAM. Thanks. EDIT: I ran it a third time with <1/2 of the data that I used the first time

Why is Maven trying to compile my code as -source 1.3?

一世执手 submitted on 2019-12-21 12:19:55
Question: I get this error when running mvn -e package on Ubuntu 12.04: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project HadoopSkeleton: Compilation failure: Compilation failure: [ERROR] /home/jesvin/dev/hadoop/HadoopMahoutSkeleton-master/src/main/java/HadoopSkeleton/App.java:[22,8] error: generics are not supported in -source 1.3 [ERROR] [ERROR] (use -source 5 or higher to enable generics) [ERROR] /home/jesvin/dev/hadoop/HadoopMahoutSkeleton

Run cvb in mahout 0.8

送分小仙女□ submitted on 2019-12-20 10:57:13
Question: The current Mahout 0.8-SNAPSHOT includes a Collapsed Variational Bayes (cvb) implementation for topic modeling and has removed the Latent Dirichlet Allocation (lda) approach, because cvb can be parallelized much better. Unfortunately, the only documentation on how to run an example and generate meaningful output is for lda. Thus, I want to: (1) preprocess some texts correctly, (2) run the cvb0_local version of cvb, and (3) inspect the results by looking at the top n words in each of the generated topics. Answer 1: So here are
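The last step in the question, looking at the top n words of each topic, can be sketched independently of Mahout. The Python below is only an illustration under stated assumptions: the topic-term weights and the vocabulary are made-up toy values, and the function name is hypothetical. In practice both would come from the model output and dictionary that the cvb run produces, converted to plain numbers and strings by some means not shown here.

```python
import heapq


def top_words_per_topic(topic_term_weights, vocabulary, n=3):
    """Return, for each topic, the n highest-weighted (word, weight) pairs.

    topic_term_weights: one list of weights per topic, aligned with vocabulary.
    """
    result = []
    for weights in topic_term_weights:
        if len(weights) != len(vocabulary):
            raise ValueError("each topic needs one weight per vocabulary word")
        # Indices of the n largest weights, highest first.
        top_indices = heapq.nlargest(n, range(len(weights)), key=weights.__getitem__)
        result.append([(vocabulary[i], weights[i]) for i in top_indices])
    return result


# Toy data (hypothetical): 2 topics over a 6-word vocabulary.
vocabulary = ["hadoop", "cluster", "vector", "movie", "rating", "user"]
topic_term_weights = [
    [0.40, 0.30, 0.20, 0.05, 0.03, 0.02],
    [0.02, 0.03, 0.05, 0.35, 0.30, 0.25],
]

top_words = top_words_per_topic(topic_term_weights, vocabulary, n=3)
for topic_id, words in enumerate(top_words):
    print(topic_id, [word for word, _ in words])
```

Selecting by index rather than sorting (weight, word) pairs keeps the weights and the dictionary separate, which matches how a topic model and its dictionary are typically stored as two different artifacts.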