Difference in calling the job

情话喂你 2021-02-01 07:31

What is the difference between calling a MapReduce job from main() and from ToolRunner.run()? When we say that the main class say, MapReduce exte

2 answers
  •  闹比i (original poster)
    2021-02-01 08:16

    With ToolRunner.run(), any Hadoop application can handle the standard command-line options that Hadoop supports. ToolRunner uses GenericOptionsParser internally: the Hadoop-specific options given on the command line are parsed and set in the application's Configuration object. If you simply use main(), this won't happen automatically.

    For example, if you run:

    % hadoop MyHadoopApp -D mapred.reduce.tasks=3
    

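    Then ToolRunner.run(new MyHadoopApp(), args) will automatically set the parameter mapred.reduce.tasks to 3 in the Configuration object. This requires MyHadoopApp to implement the Tool interface (typically by extending Configured) and to read its settings through getConf().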
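
    To make the mechanism concrete, here is a minimal plain-Java sketch of the idea. It is a simplified stand-in, not Hadoop's actual code: ToolRunnerSketch and MiniTool are hypothetical names, a plain Map plays the role of the Configuration object, the "input" and "output" arguments are invented for illustration, and only the -D option is handled. In a real job the class would extend Configured, implement Tool, and call getConf() inside run().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the ToolRunner pattern.
// All names here are hypothetical; this is not Hadoop's implementation.
class ToolRunnerSketch {

    /** Hypothetical counterpart of a Tool: gets the parsed settings and the leftover arguments. */
    interface MiniTool {
        int run(Map<String, String> conf, String[] remainingArgs);
    }

    /**
     * Moves "-D key=value" and "-Dkey=value" options into conf,
     * then calls the tool with whatever arguments are left.
     */
    static int run(Map<String, String> conf, MiniTool tool, String[] args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String property = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                property = args[++i];          // value is the next argument
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                property = arg.substring(2);   // value is attached to the flag
            }
            int eq = (property == null) ? -1 : property.indexOf('=');
            if (eq > 0) {
                conf.put(property.substring(0, eq), property.substring(eq + 1));
            } else if (property == null) {
                remaining.add(arg);            // not a -D option: pass it through
            }
            // A -D option without '=' is silently dropped in this sketch.
        }
        return tool.run(conf, remaining.toArray(new String[0]));
    }

    public static void main(String[] args) {
        // Hypothetical command line: the -D option plus two job arguments.
        String[] cli = {"-D", "mapred.reduce.tasks=3", "input", "output"};
        Map<String, String> conf = new LinkedHashMap<>();
        int exitCode = run(conf, (c, rest) -> {
            System.out.println("mapred.reduce.tasks = " + c.get("mapred.reduce.tasks"));
            System.out.println("remaining args = " + String.join(" ", rest));
            return 0;
        }, cli);
        System.out.println("exit code = " + exitCode);
    }
}
```

    A plain main() would receive all four strings untouched and would have to do this parsing itself; with the runner, the job code only sees the settings it asked for and its own arguments.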

    You do NOT get any additional privileges. People typically don't rely on a plain main() alone in Hadoop jobs; using ToolRunner.run() is standard practice.
