Question
I am creating an HDInsight cluster on Azure according to this description.
Now I would like to set custom Spark parameters, for example spark.yarn.appMasterEnv.PYSPARK3_PYTHON or spark_daemon_memory, at cluster provisioning time.
Is it possible to set these up using Data Factory or an Automation Account? I cannot find any example of doing this.
Thanks
Answer 1:
You can use sparkConfig in Data Factory to pass these configurations to Spark.
For example:
"typeProperties": {
...
"sparkConfig": {
"spark.submit.pyFiles": "/dist/package_name-1.0.0-py3.5.egg",
"spark.yarn.appMasterEnv.PYSPARK_PYTHON": "/usr/bin/anaconda/envs/py35/bin/python3"
}
}
This way you can specify any of the Spark configurations listed in the docs here.
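For context, here is a minimal sketch of a complete HDInsightSpark activity carrying the PYSPARK3_PYTHON setting from the question; the activity name, linked service name, rootPath, and entryFilePath values are placeholders to be replaced with your own:

{
    "name": "SparkActivity",
    "type": "HDInsightSpark",
    "linkedServiceName": "HDInsightLinkedService",
    "typeProperties": {
        "rootPath": "adfspark",
        "entryFilePath": "pyFiles/main.py",
        "sparkConfig": {
            "spark.yarn.appMasterEnv.PYSPARK3_PYTHON": "/usr/bin/anaconda/envs/py35/bin/python3"
        },
        "getDebugInfo": "Always"
    }
}

Note that sparkConfig entries are passed per job submission (effectively as --conf values to spark-submit), so they cover application-level settings like spark.yarn.appMasterEnv.*; cluster-level daemon settings such as spark_daemon_memory are typically managed through Ambari instead.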
Source: https://stackoverflow.com/questions/46814186/how-to-setup-custom-spark-parameter-in-hdinsights-cluster-with-data-factory