Question
I am trying to benchmark the Hadoop 2 MapReduce framework. It is NOT TeraSort, but testmapredsort.
step-1 Create random data:
hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar randomwriter -Dtest.randomwrite.bytes_per_map=100 -Dtest.randomwriter.maps_per_host=10 /data/unsorted-data
step-2 Sort the random data created in step-1:
hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort /data/unsorted-data /data/sorted-data
step-3 Check whether the sort performed by MapReduce works:
hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar testmapredsort -sortInput /data/unsorted-data -sortOutput /data/sorted-data
I get the following error during step-3, and I want to know how to fix it.
java.lang.Exception: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:266)
at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:191)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
14/08/18 11:07:39 INFO mapreduce.Job: Job job_local2061890210_0001 failed with state FAILED due to: NA
14/08/18 11:07:39 INFO mapreduce.Job: Counters: 23
File System Counters
FILE: Number of bytes read=1436271
FILE: Number of bytes written=1645526
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1077294840
HDFS: Number of bytes written=0
HDFS: Number of read operations=13
HDFS: Number of large read operations=0
HDFS: Number of write operations=1
Map-Reduce Framework
Map input records=102247
Map output records=102247
Map output bytes=1328251
Map output materialized bytes=26
Input split bytes=102
Combine input records=102247
Combine output records=1
Spilled Records=1
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=22
Total committed heap usage (bytes)=198766592
File Input Format Counters
Bytes Read=1077294840
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker.checkRecords(SortValidator.java:367)
at org.apache.hadoop.mapred.SortValidator.run(SortValidator.java:579)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.mapred.SortValidator.main(SortValidator.java:594)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:115)
at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
EDIT:
hadoop fs -ls /data/unsorted-data
-rw-r--r-- 3 david supergroup 0 2014-08-14 12:45 /data/unsorted-data/_SUCCESS
-rw-r--r-- 3 david supergroup 1077294840 2014-08-14 12:45 /data/unsorted-data/part-m-00000
hadoop fs -ls /data/sorted-data
-rw-r--r-- 3 david supergroup 0 2014-08-14 12:55 /data/sorted-data/_SUCCESS
-rw-r--r-- 3 david supergroup 137763270 2014-08-14 12:55 /data/sorted-data/part-m-00000
-rw-r--r-- 3 david supergroup 134220478 2014-08-14 12:55 /data/sorted-data/part-m-00001
-rw-r--r-- 3 david supergroup 134219656 2014-08-14 12:55 /data/sorted-data/part-m-00002
-rw-r--r-- 3 david supergroup 134218029 2014-08-14 12:55 /data/sorted-data/part-m-00003
-rw-r--r-- 3 david supergroup 134219244 2014-08-14 12:55 /data/sorted-data/part-m-00004
-rw-r--r-- 3 david supergroup 134220252 2014-08-14 12:55 /data/sorted-data/part-m-00005
-rw-r--r-- 3 david supergroup 134224231 2014-08-14 12:55 /data/sorted-data/part-m-00006
-rw-r--r-- 3 david supergroup 134210232 2014-08-14 12:55 /data/sorted-data/part-m-00007
Answer 1:
Aside from the key names having changed from test.randomwrite.bytes_per_map and test.randomwriter.maps_per_host to mapreduce.randomwriter.bytespermap and mapreduce.randomwriter.mapsperhost (so your settings never reach randomwriter), the core of the problem, as indicated by the filenames you listed under /data/sorted-data, is that your sorted data consists of map outputs, whereas correctly sorted output only comes from reduce outputs. Essentially, your sort command is only performing the map portion of the sort and never performing the merge in a subsequent reduce stage. Because of this, your testmapredsort command is correctly reporting that the sort did not work.
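For reference, with the Hadoop 2 key names the step-1 invocation would look something like the following (assuming the same examples jar you already use in step-2):
hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar randomwriter -Dmapreduce.randomwriter.bytespermap=100 -Dmapreduce.randomwriter.mapsperhost=10 /data/unsorted-data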
Checking the code of Sort.java, you can see that there is in fact no protection against num_reduces somehow getting set to 0; the typical behavior of Hadoop MR is that setting the number of reduces to 0 indicates a "map-only" job, where the map outputs go directly to HDFS rather than being intermediate outputs passed to reduce tasks. Here are the relevant lines:
int num_reduces = (int) (cluster.getMaxReduceTasks() * 0.9);
String sort_reduces = conf.get(REDUCES_PER_HOST);
if (sort_reduces != null) {
  num_reduces = cluster.getTaskTrackers() *
                Integer.parseInt(sort_reduces);
}
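To make the map-only behavior concrete, here is a minimal, hypothetical sketch against the same old-style mapred API (JobConf and setNumReduceTasks are real API; the demo class itself is only for illustration):

import org.apache.hadoop.mapred.JobConf;

public class ReduceCountDemo {
    public static void main(String[] args) {
        JobConf job = new JobConf();
        // 0 reduces: a map-only job; map outputs are written directly
        // to the output directory as part-m-00000, part-m-00001, ...
        job.setNumReduceTasks(0);
        // 1 or more reduces: map outputs become intermediate data that is
        // shuffled and merged, and the job writes part-r-* files instead.
        job.setNumReduceTasks(1);
    }
}

This matches the listing above: the part-m-* names under /data/sorted-data are the signature of a job that never ran a reduce stage.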
Now, in a normal setup, all of that logic using "default" settings should provide a nonzero number of reduces, so that the sort works. I was able to reproduce your problem by running:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 0 /data/unsorted-data /data/sorted-data
using -r 0 to force 0 reduces. In your case, it is more likely that cluster.getMaxReduceTasks() is returning 1 (or possibly even 0 if your cluster is broken). I don't know off the top of my head all the ways that method could return 1; it appears that simply setting mapreduce.tasktracker.reduce.tasks.maximum to 1 doesn't apply to that method. Other factors that go into task capacity include the number of cores and the amount of memory available.
Assuming your cluster is capable of at least 1 reduce task per TaskTracker, you can retry your sort step using -r 1:
hadoop fs -rmr /data/sorted-data
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 1 /data/unsorted-data /data/sorted-data
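Once that job finishes, the files under /data/sorted-data should be reduce outputs (named part-r-*) rather than the part-m-* map outputs shown in your listing, and you can re-run the validation from step-3:
hadoop fs -ls /data/sorted-data
hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar testmapredsort -sortInput /data/unsorted-data -sortOutput /data/sorted-data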
Source: https://stackoverflow.com/questions/25369721/error-during-benchmarking-sort-in-hadoop2-partitions-do-not-match