When I run the code
    val home = "/Users/adremja/Documents/Kaggle/outbrain"
    val documents_categories = sc.textFile(home + "/documents_categories.csv")
    documents_categories take(10) foreach println
in spark-shell, it works perfectly:
    scala> val home = "/Users/adremja/Documents/Kaggle/outbrain"
    home: String = /Users/adremja/Documents/Kaggle/outbrain

    scala> val documents_categories = sc.textFile(home + "/documents_categories.csv")
    documents_categories: org.apache.spark.rdd.RDD[String] = /Users/adremja/Documents/Kaggle/outbrain/documents_categories.csv MapPartitionsRDD[21] at textFile at <console>:26

    scala> documents_categories take(10) foreach println
    document_id,category_id,confidence_level
    1595802,1611,0.92
    1595802,1610,0.07
    1524246,1807,0.92
    1524246,1608,0.07
    1617787,1807,0.92
    1617787,1608,0.07
    1615583,1305,0.92
    1615583,1806,0.07
    1615460,1613,0.540646372
However, when I try to run it in Zeppelin, I get an error:
    java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
      at org.apache.spark.SparkContext.withScope(SparkContext.scala:679)
      at org.apache.spark.SparkContext.textFile(SparkContext.scala:797)
      ... 46 elided
Do you have any idea where the problem is?
I have Spark 2.0.1 from Homebrew (I linked it in zeppelin-env.sh as SPARK_HOME) and the Zeppelin 0.6.2 binary from Zeppelin's website.
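For reference, the link in conf/zeppelin-env.sh looks roughly like this (the exact Cellar path below is an assumption based on the default Homebrew layout; adjust it to wherever brew installed Spark on your machine):

    # conf/zeppelin-env.sh -- point Zeppelin at the Homebrew Spark install
    # (the path is an assumption; verify with `brew --prefix apache-spark`)
    export SPARK_HOME=/usr/local/Cellar/apache-spark/2.0.1/libexec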