Spark submit to YARN as another user

Anonymous (unverified), posted on 2019-12-03 08:54:24

Question:

Is it possible to submit a Spark job to a YARN cluster and choose, either on the command line or inside the jar, which user will "own" the job?

The spark-submit will be launched from a script that contains the user name.

PS: is it still possible if the cluster has a Kerberos configuration (and the script has a keytab)?

Answer 1:

For a non-kerberized cluster: running export HADOOP_USER_NAME=zorro before submitting the Spark job will do the trick.
Make sure to unset HADOOP_USER_NAME afterwards, if you want to revert to your default credentials in the rest of the shell script (or in your interactive shell session).
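
One way to avoid forgetting the unset is to scope the variable to a single command, so the surrounding shell never sees it. The following is only a minimal sketch, assuming bash; run_as_hadoop_user is a hypothetical helper name, and the spark-submit arguments stay elided as in this answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one command with HADOOP_USER_NAME set only for
# that command. The calling shell's environment is left untouched.
run_as_hadoop_user() {
  local user="$1"
  shift
  # `env VAR=value cmd` sets the variable for the child process only.
  env HADOOP_USER_NAME="$user" "$@"
}

# Usage (arguments elided, as in the answer):
# run_as_hadoop_user zorro spark-submit ...........
```

Because the variable is never exported in the parent shell, later commands in the same script run with the default user without any cleanup step.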

For a kerberized cluster, the clean way to impersonate another account without trashing your other jobs/sessions (which probably depend on your default ticket) would be something along these lines:

export KRB5CCNAME=FILE:/tmp/krb5cc_$(id -u)_temp_$$
kinit -kt ~/.protectedDir/zorro.keytab zorro@MY.REALM
spark-submit ...........
kdestroy
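
The same sequence can be wrapped in a subshell so that the temporary ticket cache is destroyed even when the job fails, and so that KRB5CCNAME never leaks into the calling shell. This is a sketch under assumptions, not a definitive implementation: it assumes bash and the standard kinit/kdestroy client tools, and submit_as_principal is a hypothetical function name. The keytab path and principal in the usage line are the ones from this answer.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command with a temporary Kerberos ticket cache.
# Usage: submit_as_principal <keytab> <principal> <command> [args...]
submit_as_principal() {
  local keytab="$1" principal="$2"
  shift 2
  (
    # The subshell keeps KRB5CCNAME and the trap out of the calling shell,
    # so the caller's default ticket cache is never touched.
    export KRB5CCNAME="FILE:/tmp/krb5cc_$(id -u)_temp_$$"
    # Destroy the temporary cache when the subshell exits, on success or failure.
    trap 'kdestroy' EXIT
    # Obtain a ticket from the keytab; give up if authentication fails.
    kinit -kt "$keytab" "$principal" || exit 1
    "$@"
    status=$?
    # Propagate the exit status of the wrapped command.
    exit "$status"
  )
}

# Usage (arguments elided, as in the answer):
# submit_as_principal ~/.protectedDir/zorro.keytab zorro@MY.REALM spark-submit ...........
```

Note that the cache file name reuses the shell's PID, as in the original commands, so two concurrent calls from the same shell would share one cache.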


Answer 2:

If the user exists on the machine, you can still launch your job with su $my_user -c "spark-submit [...]"

I am not sure about the Kerberos keytab, but if you run kinit as this user it should be fine.

If you can't use su because you don't want to supply the password, I invite you to see this Stack Overflow answer: how to run script as another user without password



Answer 3:

For a non-kerberized cluster you can pass the user name as a Spark configuration property:

--conf spark.yarn.appMasterEnv.HADOOP_USER_NAME=<user_name> 

