Question:
There are a ton of tunable settings mentioned on Spark's configurations page. However, as noted here, the SparkSubmitOptionParser attribute name for a Spark property can differ from that property's name. For instance, spark.executor.cores is passed as --executor-cores in spark-submit.

Where can I find an exhaustive list of all tuning parameters of Spark (along with their SparkSubmitOptionParser attribute names) that can be passed with the spark-submit command?
Answer 1:
While @suj1th's valuable inputs did solve my problem, I'm answering my own question to directly address my query.
You need not look up the SparkSubmitOptionParser attribute name for a given Spark property (configuration setting); either will do just fine. However, do note that there's a subtle difference between their usage, as shown below:

spark-submit --executor-cores 2

spark-submit --conf spark.executor.cores=2
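The two styles can also be mixed freely in one invocation, and --conf may be repeated for multiple properties. A sketch (app.jar and the class name are placeholders, not from the original question):

```shell
# Dedicated flag plus repeated generic --conf pairs in the same command.
spark-submit \
  --executor-cores 2 \
  --conf spark.executor.memory=4g \
  --conf spark.ui.port=4045 \
  --class com.example.App \
  app.jar
```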
Both commands shown above have the same effect. The second method takes configurations in the format --conf <key>=<value>.

Enclosing values in quotes (correct me if this is incorrect / incomplete):

(i) Values need not be enclosed in quotes (single '' or double "") of any kind (you still can if you want).

(ii) If the value contains a space character, enclose the entire thing in double quotes, like "<key>=<value>", as shown here.

For a comprehensive list of all configurations that can be passed with spark-submit, just run:

spark-submit --help
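The quoting rule in (ii) can be seen without Spark at all: the shell splits unquoted whitespace into separate arguments before spark-submit ever sees them. A quick illustration using printf (the property value here is just an example):

```shell
# Unquoted: the shell splits the value at the space,
# so spark-submit would receive three separate arguments.
printf '%s\n' --conf spark.executor.extraJavaOptions=-Xms1g -Xmx1g

# Quoted: the whole <key>=<value> pair arrives as a single argument.
printf '%s\n' --conf "spark.executor.extraJavaOptions=-Xms1g -Xmx1g"
```

The first command prints three lines (one per argument), the second only two.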
In this link provided by @suj1th, they say that:
configuration values explicitly set on a SparkConf take the highest precedence, then flags passed to spark-submit, then values in the defaults file.
If you are ever unclear where configuration options are coming from, you can print out fine-grained debugging information by running spark-submit with the --verbose option.
The following two links from the Spark docs list a lot of configurations:

- Spark Configuration
- Running Spark on YARN
Answer 2:
In your case, you should actually load your configurations from a file, as mentioned in this document, instead of passing them as flags to spark-submit. This relieves you of the overhead of mapping SparkSubmitArguments to Spark configuration parameters. To quote from the above document:
Loading default Spark configurations this way can obviate the need for certain flags to spark-submit. For instance, if the spark.master property is set, you can safely omit the --master flag from spark-submit. In general, configuration values explicitly set on a SparkConf take the highest precedence, then flags passed to spark-submit, then values in the defaults file.
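As a concrete illustration, the defaults file (read from conf/spark-defaults.conf by default) holds whitespace-separated key/value pairs; the values below are placeholders, not recommendations:

```
# conf/spark-defaults.conf -- whitespace-separated key/value pairs
spark.master              yarn
spark.executor.cores      2
spark.executor.memory     4g
spark.serializer          org.apache.spark.serializer.KryoSerializer
```

With spark.master set here, the --master flag can be omitted from spark-submit, as the quoted passage notes.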
Source: https://stackoverflow.com/questions/49381168/list-of-spark-submit-options