Question
I am having trouble running Apache Spark 1.0.1 inside a Play! application. I want to create the SparkContext within the Play! app and use some of Spark's basic machine learning (MLlib) functionality.
Here is how I create the SparkContext:
import org.apache.spark.{SparkConf, SparkContext}

def sparkFactory: SparkContext = {
  val logFile = "public/README.md" // Should be some file on your system
  val driverHost = "localhost"
  val conf = new SparkConf(false) // skip loading external settings
    .setMaster("local[4]") // run locally with enough threads
    .setAppName("firstSparkApp")
    .set("spark.logConf", "true")
    .set("spark.driver.host", driverHost)
  new SparkContext(conf)
}
And here is the error I get when I try some basic exploration of a tall-and-skinny matrix:
[error] o.a.s.e.ExecutorUncaughtExceptionHandler - Uncaught exception in thread Thread[Executor task launch worker-3,5,main]
java.lang.NoSuchMethodError: breeze.linalg.DenseVector$.dv_v_ZeroIdempotent_InPlaceOp_Double_OpAdd()Lbreeze/linalg/operators/BinaryUpdateRegistry;
at org.apache.spark.mllib.linalg.distributed.RowMatrix$$anonfun$5.apply(RowMatrix.scala:313) ~[spark-mllib_2.10-1.0.1.jar:1.0.1]
at org.apache.spark.mllib.linalg.distributed.RowMatrix$$anonfun$5.apply(RowMatrix.scala:313) ~[spark-mllib_2.10-1.0.1.jar:1.0.1]
at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144) ~[scala-library-2.10.4.jar:na]
at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144) ~[scala-library-2.10.4.jar:na]
at scala.collection.Iterator$class.foreach(Iterator.scala:727) ~[scala-library-2.10.4.jar:na]
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) ~[scala-library-2.10.4.jar:na]
The error above is triggered by the following:
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

def computePrincipalComponents(datasetId: String) = Action {
  val datapoints = DataPoint.listByDataset(datasetId)
  // load the data into spark
  val rows = datapoints.map(_.data).map { row =>
    row.map(_.toDouble)
  }
  // one dense MLlib vector per data row
  val RDDRows = WorkingSpark.context.makeRDD(rows).map { line =>
    Vectors.dense(line)
  }
  val mat = new RowMatrix(RDDRows)
  val result = mat.computePrincipalComponents(mat.numCols().toInt)
  Ok(result.toString)
}
It looks like a dependency issue, but I have no idea where it originates. Any ideas?
Answer 1:
This was indeed caused by a dependency conflict. Spark 1.0.1's MLlib calls Breeze methods that did not exist in the Breeze version I had pulled in myself, which is why the call failed at runtime with a NoSuchMethodError. After removing the explicit Breeze dependency from my Play! build file, the function above ran fine.
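For illustration, here is a minimal sketch of what the dependency section of the build file might look like after that change. The original build file is not shown, so the exact layout is an assumption; the only point is that Breeze is no longer declared explicitly, so the version that spark-mllib depends on gets resolved.

```scala
// build.sbt (sketch; the real build file is not shown in the question)
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.0.1",
  "org.apache.spark" %% "spark-mllib" % "1.0.1"
  // No explicit "org.scalanlp" %% "breeze" entry here:
  // spark-mllib brings in the Breeze version it was compiled against.
)
```

If some other library pulls in a conflicting Breeze transitively, excluding it on that dependency with sbt's `exclude("org.scalanlp", "breeze_2.10")` is one possible alternative, though I did not need that here.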
For those interested, here's the output:
-0.23490049167080018 0.4371989078912155 0.5344916752692394 ... (6 total)
-0.43624389448418854 0.531880914138611 0.1854269324452522 ...
-0.5312372137092107 0.17954211389001487 -0.456583286485726 ...
-0.5172743086226219 -0.2726152326516076 -0.36740474569706394 ...
-0.3996400343756039 -0.5147253632175663 0.303449047782936 ...
-0.21216780828347453 -0.39301803119012546 0.4943679121187219 ...
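For anyone hitting a similar NoSuchMethodError, one quick way to see where a conflicting class comes from is to ask the JVM which jar it was loaded from. This is only a sketch using standard JVM reflection; the helper name `jarOf` is made up for this example, and the Breeze class name is passed as a string so the snippet compiles even without Breeze on the classpath.

```scala
// Returns the location (usually a jar path) a class was loaded from, if known.
// Classes loaded by the bootstrap class loader have no code source, hence the Option.
def jarOf(className: String): Option[String] = {
  val cls = Class.forName(className)
  Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation.toString)
}

// Usage inside the Play! app (requires Breeze on the classpath):
// println(jarOf("breeze.linalg.DenseVector"))
```

If the printed path points at a Breeze jar with a different version than the one spark-mllib expects, that is the dependency to remove or exclude.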
Source: https://stackoverflow.com/questions/24855776/apache-spark-java-lang-nosuchmethoderror-breeze-linalg-densevector