Question
I tried to install and build Spark 2.0.0 on an Ubuntu 16.04 VM as follows:
1. Install Java
sudo apt-add-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
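To make sure the JDK is actually on your PATH before moving on, a quick check (the exact 8uXX build number in the output will vary):
java -version
javac -version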
2. Install Scala
Go to the Downloads page on the Scala site: scala-lang.org/download/all.html
I used Scala 2.11.8.
sudo mkdir /usr/local/src/scala
sudo tar -xvf scala-2.11.8.tgz -C /usr/local/src/scala/
Modify the .bashrc file and include the path for Scala:
export SCALA_HOME=/usr/local/src/scala/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
then type:
. .bashrc
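You can confirm that the shell picked up the new variables (the paths below assume the install location used above):
echo $SCALA_HOME   # should print /usr/local/src/scala/scala-2.11.8
scala -version     # should report version 2.11.8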
3. Install git
sudo apt-get install git
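A quick sanity check that git is now available (any version that Ubuntu 16.04 ships is fine):
git --version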
4. Download and build Spark
Go to: http://spark.apache.org/downloads.html
Download Spark 2.0.0 (Build from Source - for standalone mode).
tar -xvf spark-2.0.0.tgz
Then cd into the extracted Spark folder.
now type:
./build/sbt assembly
After it's done installing, I get the message:
[success] Total time: 1940 s, completed...
followed by the date and time.
5. Run the Spark shell
bin/spark-shell
That's when all hell breaks loose and I start getting the error. I go into the assembly folder to look for a folder called target. But there's no such folder there. The only things visible in assembly are: pom.xml, README, and src.
I looked around online for quite a while and haven't been able to find a single concrete solution that would help solve the error. Can someone please provide explicit step-by-step instructions for solving this?! It's driving me nuts now... (T.T)
Screenshot of the error: (screenshot not reproduced here)
Answer 1:
For some reason, Scala 2.11.8 did not work for the build, but if I switch over to Scala 2.10.6 it builds properly. I guess the reason I need Scala in the first place is to get access to sbt so I can build Spark. Once it's built, I need to go into the Spark folder and type:
build/sbt package
This will build the missing JAR files for me using Scala 2.11... kinda weird, but that's how it's working (I'm assuming, based on the logs).
Once Spark builds again, type bin/spark-shell (while in the Spark folder) and you'll have access to the Spark shell.
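Once the shell is up, a minimal smoke test at the scala> prompt confirms the build really works (sc is the SparkContext that spark-shell creates for you):
sc.parallelize(1 to 100).sum()   // should return 5050.0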
Answer 2:
Type sbt package in the Spark directory, not in the build directory.
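In other words, assuming the sources were extracted into a folder named spark-2.0.0, the sequence would be:
cd spark-2.0.0    # the top-level Spark source directory
sbt package       # or build/sbt package to use the launcher bundled with the sources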
Answer 3:
If your goal is really to build your custom Spark package from the sources you've downloaded from http://spark.apache.org/downloads.html, you should do the following instead:
./build/mvn -Phadoop-2.7,yarn,mesos,hive,hive-thriftserver -DskipTests clean install
You may want to read the official document Building Spark.
NB: You don't have to install the Scala and git packages to build Spark, so you could have skipped the "2. Install Scala" and "3. Install git" steps.
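If, after the Maven build, you want a self-contained binary distribution rather than running out of the source tree, the sources also ship a helper script that wraps the same Maven profiles (the profile list below simply mirrors the command above):
./dev/make-distribution.sh --name custom-spark --tgz -Phadoop-2.7,yarn,mesos,hive,hive-thriftserver
This produces a tarball (spark-2.0.0-bin-custom-spark.tgz) that you can unpack anywhere and run bin/spark-shell from.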
Source: https://stackoverflow.com/questions/39282434/how-to-build-spark-from-the-sources-from-the-download-spark-page