I am able to run a Spark job using the BashOperator, but I want to use the SparkSubmitOperator for it, with Spark in standalone mode.
You can either create a new connection using the Airflow Web UI or change the spark_default connection.
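If you would rather do this in code than through the Web UI, a connection can also be created programmatically. This is a minimal sketch using Airflow's Connection model and session; the host and port are placeholders for your standalone master, and it assumes no spark_default connection exists yet:

```python
import json

from airflow import settings
from airflow.models import Connection

# Sketch: create the spark_default connection for a standalone master.
# spark://HOST and 7077 are placeholders for your actual master address.
conn = Connection(
    conn_id="spark_default",
    conn_type="spark",
    host="spark://HOST",
    port=7077,
    extra=json.dumps({"deploy_mode": "client", "spark_binary": "spark-submit"}),
)

session = settings.Session()
session.add(conn)
session.commit()
```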
The master can be local, yarn, spark://HOST:PORT, mesos://HOST:PORT, or k8s://https://&lt;HOST&gt;:&lt;PORT&gt;.
You can also supply the following options in the connection's extras:
{"queue": "root.default", "deploy_mode": "cluster", "spark_home": "", "spark_binary": "spark-submit", "namespace": "default"}
Either the spark-submit binary should be available on the PATH, or spark_home must be set in the extras on the connection.
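Putting it together, a minimal DAG sketch using the SparkSubmitOperator against that connection could look like this. The application path, dag_id, and dates are placeholders, and the import path assumes Airflow 2.x with the apache-airflow-providers-apache-spark package installed (on Airflow 1.x the operator lives under airflow.contrib.operators.spark_submit_operator instead):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

# Sketch of a DAG that submits a Spark job through the spark_default connection.
# The master URL is taken from the connection's host, not hard-coded here.
with DAG(
    dag_id="spark_submit_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    submit_job = SparkSubmitOperator(
        task_id="submit_job",
        conn_id="spark_default",        # uses the connection configured above
        application="/path/to/app.py",  # placeholder for your Spark application
        verbose=True,
    )
```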