How to initialize the spark-shell with a specific user to save data to HDFS with Apache Spark

Submitted 2020-05-17 07:10:14

Question


  • I'm using Ubuntu
  • I'm using the Spark dependency via IntelliJ
  • `Command 'spark' not found, but can be installed with: ..` (when I enter `spark` in the shell)
  • I have two users, amine and hadoop_amine (where the Hadoop HDFS is set up)

When I try to save a DataFrame to HDFS (Spark Scala):

processed.write.format("json").save("hdfs://localhost:54310/mydata/enedis/POC/processed.json")

I get this error:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/mydata/enedis/POC":hadoop_amine:supergroup:drwxr-xr-x

Answer 1:


Try changing the permissions of the HDFS directory, or simply change the user Spark runs as. To change the directory permissions you can use the hdfs command line, like this:

hdfs dfs -chmod  ...
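For example, a sketch of the permission fix, assuming the path and owner from the error message in the question (run these as a user with HDFS superuser rights, e.g. hadoop_amine):

```shell
# Open the directory tree to group writes; the path is the one from the
# question's AccessControlException.
hdfs dfs -chmod -R 775 hdfs://localhost:54310/mydata/enedis/POC

# Alternatively, hand ownership of the tree to the user Spark runs as:
hdfs dfs -chown -R amine:supergroup hdfs://localhost:54310/mydata/enedis/POC
```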

With spark-submit you can also use the `--proxy-user` option. And finally, you can run spark-submit or spark-shell as the proper user with a command like:

sudo -u hadoop_amine spark-submit ...
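Another common approach (my addition, not part of the original answer) is to set the `HADOOP_USER_NAME` environment variable before starting the shell. On an unsecured (non-Kerberos) cluster, the HDFS client takes its identity from this variable, so writes are performed as that user:

```shell
# Works only without Kerberos: the HDFS client identifies as hadoop_amine,
# so the save() call in the question writes with that user's permissions.
export HADOOP_USER_NAME=hadoop_amine
spark-shell
```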


Source: https://stackoverflow.com/questions/61582859/how-to-intialize-the-spark-shell-with-a-specific-user-to-save-data-to-hdfs-by-ap
