I am running a Spark cluster over C++ code wrapped in Python. I am currently testing different configurations of multi-threading options (at the Python level or the Spark level).
To check how many workers were started on each slave, open a web browser at http://master-ip:8080 and look at the Workers section; it shows exactly how many workers were started and which worker runs on which slave. (I mention this because I am not sure what you mean by '4 slaves per node'.)
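If you prefer the command line, the standalone master UI also serves the same cluster state as JSON (at the /json path of that same port, as far as I recall), so you can check it with something like:

    curl http://master-ip:8080/json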
By default, Spark starts exactly one worker on each slave unless you specify
SPARK_WORKER_INSTANCES=n
in conf/spark-env.sh, where n is the number of worker instances you would like to start on each slave.
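For example, a minimal conf/spark-env.sh sketch if you wanted two workers per slave (the core and memory numbers below are placeholders, not recommendations):

    # conf/spark-env.sh on each slave
    export SPARK_WORKER_INSTANCES=2   # start 2 worker daemons on this slave
    export SPARK_WORKER_CORES=4       # cores each worker can hand out to executors
    export SPARK_WORKER_MEMORY=8g     # memory each worker can hand out to executors

If you run more than one worker instance per slave, it is worth setting SPARK_WORKER_CORES and SPARK_WORKER_MEMORY explicitly as well, so the workers do not all try to claim the whole machine.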
When you submit a Spark job through spark-submit, Spark starts an application driver and several executors for the job. The
--total-executor-cores
value that you specify limits the total number of cores available to that application.
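As an illustration only (the master URL, the core/memory numbers, and your_app.py are placeholders for your own setup), a standalone-mode submission could look like:

    spark-submit \
      --master spark://master-ip:7077 \
      --total-executor-cores 8 \
      --executor-memory 4g \
      your_app.py

Spark then spreads those cores across the workers it finds in the cluster, which is what ultimately caps the Spark-level parallelism you are comparing against your Python-level threading.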