Finding the hostname of slave nodes in Hadoop during execution of a MapReduce job

Submitted by 陌路散爱 on 2019-12-11 20:00:21

Question


I want to know how MapReduce code is executed on a Hadoop 2.9.0 multi-node cluster. I want to understand which node processes which input. Specifically, how can I find out which mapper processes each part of the input data? I executed the following Python code on the master:

import sys
import socket

for line in sys.stdin:
    line = line.strip()
    words = line.split()
    for word in words:
        print('%s\t%s\t%s' % (word, 1, socket.gethostname()))

I used socket.gethostname() to find the hostname of the node running each mapper. I expected the output of this mapper to be, for example:

Bye     1   hadoopmaster
Goodbye 1   hadoopmaster
Hadoop  1   hadoopmaster
Hadoop  1   hadoopslave1
Hello   1   hadoopmaster
Hello   1   hadoopslave2

But the actual output is:

Bye     1   hadoopmaster
Goodbye 1   hadoopmaster
Hadoop  1   hadoopmaster
Hadoop  1   hadoopmaster
Hello   1   hadoopmaster
Hello   1   hadoopmaster

Is the code not running on the slave nodes?
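To see which mapper handles which part of the input, the mapper could also report its task attempt and input split alongside the hostname. The following is a minimal sketch, not a confirmed fix. It assumes the job runs through Hadoop Streaming, which exports job configuration properties as environment variables with dots replaced by underscores; the variable names used here (mapreduce_task_attempt_id, mapreduce_map_input_file, mapreduce_map_input_start, mapreduce_map_input_length) follow that convention and should be verified on the cluster. The helper names describe_task and map_lines are hypothetical.

```python
import os
import socket
import sys


def describe_task(environ=os.environ):
    """Collect the host, task attempt, and input split of this mapper.

    The environment variable names are assumptions based on the Hadoop
    Streaming convention (property name with dots turned into underscores);
    any variable that is missing is reported as "unknown".
    """
    return {
        "host": socket.gethostname(),
        "attempt": environ.get("mapreduce_task_attempt_id", "unknown"),
        "input_file": environ.get("mapreduce_map_input_file", "unknown"),
        "split_start": environ.get("mapreduce_map_input_start", "unknown"),
        "split_length": environ.get("mapreduce_map_input_length", "unknown"),
    }


def map_lines(lines, task):
    """Yield one tab-separated record per word: word, 1, host, attempt."""
    for line in lines:
        for word in line.strip().split():
            yield "%s\t%s\t%s\t%s" % (word, 1, task["host"], task["attempt"])


if __name__ == "__main__":
    task = describe_task()
    # Write the split details to stderr so they do not mix with the
    # mapper's key/value output on stdout.
    sys.stderr.write("task info: %r\n" % (task,))
    for record in map_lines(sys.stdin, task):
        print(record)
```

If every record shows the same hostname but different attempt IDs, several map tasks ran on one node; if the attempt ID is also identical everywhere, the whole input was handled by a single map task.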

Source: https://stackoverflow.com/questions/50656448/finding-hostname-of-slave-nodes-in-hadoop-during-execution-of-running-map-reduce
