Hadoop jobs fail when submitted by users other than yarn (MRv2) or mapred (MRv1)

無奈伤痛 2021-01-13 12:38

I am running a test cluster running MRv1 (CDH5) paired with LocalFileSystem, and the only user I am able to run jobs as is mapred (as mapred is the user starting the jobtrac

2 Answers
  • 2021-01-13 12:52

    This worked for me. I just set this property in MRv1:

    <property>
        <name>hadoop.security.authorization</name>
        <value>simple</value>
    </property>
    

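    Note that hadoop.security.authorization is documented as a boolean (true or false), while simple is a value of hadoop.security.authentication, so double-check which of the two you are setting. The Hadoop Service Level Authorization guide describes the ACLs as follows: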

    Access Control Lists

    ${HADOOP_CONF_DIR}/hadoop-policy.xml defines an access control list for each Hadoop service. Every access control list has a simple format:

    The lists of users and groups are both comma-separated lists of names. The two lists are separated by a space.

    Example: user1,user2 group1,group2.

    Add a blank at the beginning of the line if only a list of groups is to be provided; equivalently, a comma-separated list of users followed by a space or nothing implies only a set of given users.

    A special value of * implies that all users are allowed to access the service.
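
    The format rules above can be sketched in a few lines of Python. This is only an illustration of the documented format, not Hadoop's own parser; the function names parse_acl and is_allowed are made up for this example.

```python
def parse_acl(value):
    """Parse an ACL string into (allow_all, users, groups).

    Illustrative only: follows the documented format
    "user1,user2 group1,group2", with "*" meaning everyone.
    """
    if value.strip() == "*":
        return True, set(), set()
    # Split once on the first space: users on the left, groups on the right.
    users_part, _, groups_part = value.partition(" ")
    users = {u.strip() for u in users_part.split(",") if u.strip()}
    groups = {g.strip() for g in groups_part.split(",") if g.strip()}
    return False, users, groups


def is_allowed(value, user, user_groups):
    """Return True if the user, or any of the user's groups, is listed in the ACL."""
    allow_all, users, groups = parse_acl(value)
    return allow_all or user in users or bool(groups & set(user_groups))


if __name__ == "__main__":
    print(parse_acl("alice,bob mapreduce"))
    print(parse_acl(" datanodes"))  # leading blank: groups only
    print(is_allowed("alice,bob mapreduce", "carol", ["mapreduce"]))
```

    With this reading, a value that starts with a blank has an empty user list, and a value with no space at all has an empty group list.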

    Refreshing Service Level Authorization Configuration

    The service-level authorization configuration for the NameNode and JobTracker can be changed without restarting either of the Hadoop master daemons. The cluster administrator can change ${HADOOP_CONF_DIR}/hadoop-policy.xml on the master nodes and instruct the NameNode and JobTracker to reload their respective configurations via the -refreshServiceAcl switch to dfsadmin and mradmin commands respectively.

    Refresh the service-level authorization configuration for the NameNode:

    $ bin/hadoop dfsadmin -refreshServiceAcl

    Refresh the service-level authorization configuration for the JobTracker:

    $ bin/hadoop mradmin -refreshServiceAcl

    Of course, one can use the security.refresh.policy.protocol.acl property in ${HADOOP_CONF_DIR}/hadoop-policy.xml to restrict which users and groups are allowed to refresh the service-level authorization configuration.

    Examples

    Allow only users alice, bob and users in the mapreduce group to submit jobs to the MapReduce cluster:

    <property>
         <name>security.job.submission.protocol.acl</name>
         <value>alice,bob mapreduce</value>
    </property>
    

    Allow only DataNodes running as the users who belong to the group datanodes to communicate with the NameNode:

    <property>
         <name>security.datanode.protocol.acl</name>
         <value>datanodes</value>
    </property>

    Allow any user to talk to the HDFS cluster as a DFSClient:
    
    <property>
         <name>security.client.protocol.acl</name>
         <value>*</value>
    </property>
    
  • 2021-01-13 13:07

    You need to set up a staging directory for each user in the cluster. This is not as complicated as it sounds.

    Check the following properties:

    <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-${user.name}</value>
    <source>core-default.xml</source>
    </property>
    

    This basically sets up a temporary directory for each user.

    Tie this to your staging directory:

    <property>
    <name>mapreduce.jobtracker.staging.root.dir</name>
    <value>${hadoop.tmp.dir}/mapred/staging</value>
    <source>mapred-default.xml</source>
    </property>
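
    To see what the two properties resolve to for a given user, here is a small Python sketch of the ${...} substitution. It is only an approximation of how Hadoop expands variables, and the helper names expand and staging_root are hypothetical. In a real cluster, ${user.name} comes from the Java system properties of whichever process reads the configuration.

```python
import re

# Matches one ${name} reference.
_VAR = re.compile(r"\$\{([^}]+)\}")


def expand(value, props, max_depth=20):
    """Replace ${name} references with props[name], one at a time.

    Stops when no reference is left, when the first remaining reference
    is unknown, or after max_depth substitutions.
    """
    for _ in range(max_depth):
        match = _VAR.search(value)
        if match is None:
            return value
        name = match.group(1)
        if name not in props:
            return value  # unknown reference: leave the rest untouched
        value = value[:match.start()] + props[name] + value[match.end():]
    return value


def staging_root(user):
    """Resolve the staging root for a user, using the two values shown above."""
    props = {
        "user.name": user,  # assumption: stands in for the JVM's user.name
        "hadoop.tmp.dir": "/tmp/hadoop-${user.name}",
        "mapreduce.jobtracker.staging.root.dir": "${hadoop.tmp.dir}/mapred/staging",
    }
    return expand(props["mapreduce.jobtracker.staging.root.dir"], props)


if __name__ == "__main__":
    print(staging_root("alice"))  # /tmp/hadoop-alice/mapred/staging
```

    With these values each user gets a separate staging root, so the submitting user needs write access to their own directory rather than to one owned by mapred.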
    

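    Let me know if this works, or if it is already set up this way.

    As the source tags show, the defaults come from core-default.xml and mapred-default.xml, so overrides belong in core-site.xml and mapred-site.xml respectively, not in yarn-site.xml.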
