Question
I'm working on using the REST interface to Hadoop's HDFS as a convenient way to store files over the network. To test, I installed Hadoop on my Mac (OS X 10.8.5) following these instructions:
http://importantfish.com/how-to-install-hadoop-on-mac-os-x/
That worked like a charm and I'm able to start hadoop and run a basic test:
hadoop-examples-1.1.2.jar pi 10 100
Now, I'm using the python client to handle the HTTP requests to/from webhdfs:
http://pythonhosted.org/pywebhdfs/
But I'm stumbling on a basic permissions error when I try to create a directory:
from pywebhdfs.webhdfs import PyWebHdfsClient
hdfs = PyWebHdfsClient()
my_dir = 'user/hdfs/data/new_dir'
hdfs.make_dir(my_dir, permission=755)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Python/2.7/site-packages/pywebhdfs/webhdfs.py", line 207, in make_dir
_raise_pywebhdfs_exception(response.status_code, response.text)
File "/Library/Python/2.7/site-packages/pywebhdfs/webhdfs.py", line 428, in _raise_pywebhdfs_exception
raise errors.PyWebHdfsException(msg=message)
pywebhdfs.errors.PyWebHdfsException: {"RemoteException":{"exception":"AccessControlException","javaClassName":"org.apache.hadoop.security.AccessControlException","message":"Permission denied: user=webuser, access=WRITE, inode=\"user\":mlmiller:supergroup:rwxr-xr-x"}}
I've also tried specifying the user as 'hdfs' instead of the Python library's default of 'webhdfs' but get the same result. After 30 minutes of reading I gave up and realized I don't understand the interplay of HDFS users, Hadoop security (which I enabled following the install instructions), and my unix user and permissions.
Answer 1:
The PyWebHdfsClient user_name needs to match a unix user that has write permission on the directory you are trying to write to. By default, the user that starts the namenode service is the HDFS "superuser", and permission checks never fail for it.
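A minimal sketch of picking that user name, assuming the Python process runs under the same unix account that started the namenode (if it does not, pass the name explicitly). `resolve_hdfs_user` is a hypothetical helper, not part of pywebhdfs; `getpass.getuser()` is from the standard library.

```python
import getpass


def resolve_hdfs_user(explicit_user=None):
    """Return the user name to send to WebHDFS.

    Falls back to the current unix login name, which only helps when this
    process runs as the account that started the namenode (an assumption).
    """
    return explicit_user or getpass.getuser()


# Hypothetical usage with the client from the question:
#   from pywebhdfs.webhdfs import PyWebHdfsClient
#   hdfs = PyWebHdfsClient(user_name=resolve_hdfs_user())
```

Passing an explicit name is the safer choice when the script runs on a different machine or account than the one hosting HDFS.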
I wrote the pywebhdfs client you are using in response to a need at work. If you have any issues or would like to ask for features on the client itself please leave an issue on github and I can address it.
https://github.com/ProjectMeniscus/pywebhdfs/issues
Thank you
Answer 2:
Figured this one out after stepping away and reading some more docs. webhdfs expects you to specify a user value that matches the unix user who launched HDFS from the shell. So the correct Python is:
from pywebhdfs.webhdfs import PyWebHdfsClient
user = 'your_unix_user'  # placeholder: the unix user who launched Hadoop
hdfs = PyWebHdfsClient(user_name=user)
my_dir = '%s/data/new_dir' % user
hdfs.make_dir(my_dir, permission=755)
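For context, here is a rough sketch of the kind of REST request that sits behind a directory creation, which shows where the user name ends up. It only builds the URL and sends nothing. It assumes Python 3, the namenode's web interface on localhost:50070, and simple (non-Kerberos) authentication via the `user.name` query parameter; `mkdirs_url` is a hypothetical helper, not part of pywebhdfs.

```python
from urllib.parse import quote, urlencode


def mkdirs_url(path, user, host="localhost", port=50070, permission="755"):
    """Build the WebHDFS URL for an HTTP PUT with op=MKDIRS.

    host and port are assumptions about a local single-node setup.
    """
    query = urlencode([
        ("op", "MKDIRS"),
        ("user.name", user),         # the identity HDFS checks permissions against
        ("permission", permission),  # octal permission string
    ])
    return "http://%s:%d/webhdfs/v1/%s?%s" % (
        host, port, quote(path.lstrip("/")), query)


print(mkdirs_url("mlmiller/data/new_dir", "mlmiller"))
# http://localhost:50070/webhdfs/v1/mlmiller/data/new_dir?op=MKDIRS&user.name=mlmiller&permission=755
```

If `user.name` names an account without write access to the parent directory, the namenode answers with the AccessControlException shown in the question.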
Source: https://stackoverflow.com/questions/19012798/permissions-error-on-webhdfs