Is there the possibility to run the Spark standalone cluster locally on just one machine (which is basically different from just developing jobs locally (i.e., lo
yes you can do it, launch one master and one worker node and you are good to go
launch master
./sbin/start-master.sh
launch worker
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M
run SparkPi example
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://localhost:7077 lib/spark-examples-1.2.1-hadoop2.4.0.jar
Apache Spark Standalone Mode Documentation
A small update as for the latest version (the 2.1.0), the default is to bind the master to the hostname, so when starting a worker locally use the output of hostname
:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://`hostname`:7077 -c 1 -m 512M
And to run an example, simply run the following command:
bin/run-example SparkPi
If you can't find the ./sbin/start-master.sh
file on your machine, you can start the master also with
./bin/spark-class org.apache.spark.deploy.master.Master
More simply,
./sbin/start-all.sh
On your local machine there will be master and one worker launched.
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://localhost:7077 \
examples/jars/spark-examples_2.12-3.0.1.jar 10000
A sample application is submitted. For monitoring via Web UI:
Master UI: http://localhost:8080
Worker UI: http://localhost:8081
Application UI: http://localhost:4040