Question
- I'm using Ubuntu.
- I'm using the Spark dependency in IntelliJ.
- When I enter spark in the shell I get: Command 'spark' not found, but can be installed with: ..
- I have two users, amine and hadoop_amine (the one where Hadoop HDFS is set up).
When I try to save a DataFrame to HDFS (Spark Scala):
procesed.write.format("json").save("hdfs://localhost:54310/mydata/enedis/POC/processed.json")
I get this error:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/mydata/enedis/POC":hadoop_amine:supergroup:drwxr-xr-x
Answer 1:
Try changing the permissions of the HDFS directory, or simply change the user Spark runs as.
To change the directory permissions you can use the hdfs command line, like this:
hdfs dfs -chmod ...
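For example, a minimal sketch assuming the path from the error message; run it as hadoop_amine, since only the directory owner or the HDFS superuser may change its permissions:
sudo -u hadoop_amine hdfs dfs -chmod -R 777 /mydata/enedis/POC
777 opens the directory to everyone, which is usually acceptable for a local POC but worth tightening later.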
With spark-submit you can use the --proxy-user option.
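For example, a sketch where the class name and jar are hypothetical placeholders; note that --proxy-user typically requires Hadoop to be configured to let the submitting user impersonate hadoop_amine:
spark-submit --proxy-user hadoop_amine --class com.example.Main my-app.jar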
Finally, you can run spark-submit or spark-shell as the proper user, with a command like this:
sudo -u hadoop_amine spark-submit ...
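For example, a sketch with a hypothetical class name and jar; under simple (non-Kerberos) authentication HDFS takes the identity from the client's OS user, so running as hadoop_amine makes the write come from the directory owner:
sudo -u hadoop_amine spark-shell
sudo -u hadoop_amine spark-submit --class com.example.Main my-app.jar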
Source: https://stackoverflow.com/questions/61582859/how-to-intialize-the-spark-shell-with-a-specific-user-to-save-data-to-hdfs-by-ap