Hadoop yarn node -list shows slaves as localhost.localdomain:<port>; connection refused exception

Anonymous (unverified), submitted 2019-12-03 08:52:47

Question:

I get a connection refused exception from localhost.localdomain/127.0.0.1 to localhost.localdomain:55352 when trying to run the wordcount program. yarn node -list gives:

  hduser@localhost:/usr/local/hadoop/etc/hadoop$ yarn node -list
  15/05/27 07:23:54 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.111.72:8040
  Total Nodes:2
           Node-Id         Node-State Node-Http-Address   Number-of-Running-Containers
  localhost.localdomain:32991         RUNNING localhost.localdomain:8042                             0
  localhost.localdomain:55352         RUNNING localhost.localdomain:8042                             0

master /etc/hosts:

  127.0.0.1    localhost localhost.localdomain localhost4 localhost4.localdomain4
  #127.0.1.1    ubuntu-Standard-PC-i440FX-PIIX-1996
  192.168.111.72  master
  192.168.111.65  slave1
  192.168.111.66  slave2

  # The following lines are desirable for IPv6 capable hosts
  ::1     ip6-localhost ip6-loopback
  fe00::0 ip6-localnet
  ff00::0 ip6-mcastprefix
  ff02::1 ip6-allnodes
  ff02::2 ip6-allrouters

slave /etc/hosts:

  127.0.0.1       localhost.localdomain localhost
  #127.0.1.1      ubuntu-Standard-PC-i440FX-PIIX-1996
  192.168.111.72  master
  #192.168.111.65  slave1
  #192.168.111.66  slave2

  # The following lines are desirable for IPv6 capable hosts
  ::1     ip6-localhost ip6-loopback
  fe00::0 ip6-localnet
  ff00::0 ip6-mcastprefix
  ff02::1 ip6-allnodes
  ff02::2 ip6-allrouters

My understanding is that the master is wrongly trying to connect to the slaves on localhost. Please help me resolve this. Any suggestions are appreciated. Thank you.

Answer 1:

Here is how the NodeManager builds the NodeId:

  private NodeId buildNodeId(InetSocketAddress connectAddress,
      String hostOverride) {
    if (hostOverride != null) {
      connectAddress = NetUtils.getConnectAddress(
          new InetSocketAddress(hostOverride, connectAddress.getPort()));
    }
    return NodeId.newInstance(
        connectAddress.getAddress().getCanonicalHostName(),
        connectAddress.getPort());
  }

The NodeManager takes the canonical hostname of the address it binds to. When that address is 127.0.0.1, the lookup returns the loopback name (localhost).
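To make the lookup concrete, here is a minimal sketch of how a files-based resolver typically picks the canonical name for an address: it reports the first hostname on the matching /etc/hosts line and treats the rest as aliases. The class HostsCanonicalName and the method canonicalNameFor are hypothetical helpers written for this illustration, not Hadoop or JDK code, and the real lookup goes through the operating system's resolver, which may also consult DNS.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Simplified model of a reverse lookup in /etc/hosts (hypothetical helper, not Hadoop code). */
final class HostsCanonicalName {

    /**
     * Returns the first hostname listed for the given IP. Assumption: the
     * resolver reads /etc/hosts only and reports that first name as canonical.
     */
    static Optional<String> canonicalNameFor(String ip, List<String> hostsLines) {
        for (String raw : hostsLines) {
            // Drop comments and surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] fields = line.split("\\s+");
            // fields[0] is the address, fields[1] the canonical name, the rest are aliases.
            if (fields.length >= 2 && fields[0].equals(ip)) {
                return Optional.of(fields[1]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // First lines of the slave's /etc/hosts from the question.
        List<String> slaveHosts = Arrays.asList(
            "127.0.0.1       localhost.localdomain localhost",
            "#127.0.1.1      ubuntu-Standard-PC-i440FX-PIIX-1996",
            "192.168.111.72  master");
        // Prints "localhost.localdomain", the host part of the Node-Id in the question.
        System.out.println(canonicalNameFor("127.0.0.1", slaveHosts).orElse("(none)"));
    }
}
```

Under this model, whichever name comes first on the 127.0.0.1 line is the one that ends up in the NodeId.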

So in your case, on each slave, localhost.localdomain is the canonical hostname for 127.0.0.1, because it is the first name on that line of /etc/hosts. A possible fix is to change the first line of /etc/hosts on slave1 and slave2, respectively, to:

  127.0.0.1  slave1 localhost.localdomain localhost 

and

  127.0.0.1  slave2 localhost.localdomain localhost 
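One way to check the effect of such a change is to ask the JVM directly which canonical name it resolves for an address, using the same InetAddress.getCanonicalHostName() call that appears in buildNodeId. The sketch below is a standalone check, not part of Hadoop; the class name CanonicalNameCheck and the default address 127.0.0.1 are assumptions made for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Standalone check of the canonical name the JVM resolves (hypothetical helper, not Hadoop code). */
final class CanonicalNameCheck {

    /** Resolves the address, then performs the reverse lookup that buildNodeId relies on. */
    static String canonicalNameOf(String address) throws UnknownHostException {
        // getCanonicalHostName() falls back to the textual IP if no name can be found.
        return InetAddress.getByName(address).getCanonicalHostName();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Assumption: 127.0.0.1 is the address of interest unless another is passed in.
        String address = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(address + " -> " + canonicalNameOf(address));
    }
}
```

Run on a slave before and after editing /etc/hosts, the printed name should indicate what the NodeManager is likely to report as its host, although the result also depends on how the system resolver is configured.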

