I need to perform an aggregation using the results from all the reduce tasks. Basically, each reduce task finds the sum and count and a value. I need to add all the sums and counts a
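Found the solution: I used counters. Hadoop adds up each counter across all tasks, and the driver can read the totals once the job has finished. This example helped: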
http://diveintodata.org/2011/03/15/an-example-of-hadoop-mapreduce-counter/
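import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

public class FlightData {

    // Counters updated by the reducers; Hadoop sums them across all tasks.
    public static enum FlightCounters {
        FLIGHT_COUNT,
        FLIGHT_DELAY;
    }

    // The generic type parameters were lost from the post;
    // <Text, Text, Text, Text> is an assumption, adjust to your job's types.
    public static class MyReducer
            extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text key, Iterable<Text> values,
                           Context context
                           ) throws IOException, InterruptedException {
            // ... code that fills origin and dest from values is omitted ...
            float delay1 = Float.parseFloat(origin[5]);
            float delay2 = Float.parseFloat(dest[5]);
            context.getCounter(FlightCounters.FLIGHT_COUNT).increment(1);
            // Counters hold long values, so the fractional part of the delay is dropped here.
            context.getCounter(FlightCounters.FLIGHT_DELAY)
                   .increment((long) (delay1 + delay2));
        }
    }

    public static void main(String[] args) throws Exception {
        float flightCount, flightDelay;
        // ... job configuration omitted ...
        job.waitForCompletion(true);
        // Get the final totals accumulated in the counters by all map and reduce tasks.
        flightCount = job.getCounters()
                         .findCounter(FlightCounters.FLIGHT_COUNT).getValue();
        flightDelay = job.getCounters()
                         .findCounter(FlightCounters.FLIGHT_DELAY).getValue();
    }
}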
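One thing to watch: the (long) cast in the reducer throws away the fractional part of every delay before it is added to the counter. If that matters, one possible workaround is to store the delay in hundredths and divide again in the driver. The sketch below shows only the arithmetic in plain Java, with no Hadoop dependency. The class CounterAverageSketch, the FLIGHT_DELAY_CENTI counter and the EnumMap standing in for Hadoop's counters are all hypothetical names made up for illustration, and computing an average at the end is my guess at what the totals are for.

```java
import java.util.EnumMap;
import java.util.Map;

final class CounterAverageSketch {

    // Hypothetical counter names; FLIGHT_DELAY_CENTI holds the delay in hundredths.
    enum FlightCounters { FLIGHT_COUNT, FLIGHT_DELAY_CENTI }

    // Stand-in for Hadoop's counters: one long total per enum constant.
    static final Map<FlightCounters, Long> counters =
            new EnumMap<>(FlightCounters.class);

    static void increment(FlightCounters counter, long amount) {
        counters.merge(counter, amount, Long::sum);
    }

    // What a reducer would do for one pair of flights.
    static void record(float delay1, float delay2) {
        increment(FlightCounters.FLIGHT_COUNT, 1);
        // Scale by 100 before rounding so two decimal places survive in the long.
        increment(FlightCounters.FLIGHT_DELAY_CENTI,
                  Math.round((delay1 + delay2) * 100.0));
    }

    // What the driver would do after the job has finished.
    static double averageDelay() {
        long count = counters.getOrDefault(FlightCounters.FLIGHT_COUNT, 0L);
        long centi = counters.getOrDefault(FlightCounters.FLIGHT_DELAY_CENTI, 0L);
        // Guard against division by zero when no records were counted.
        return count == 0 ? 0.0 : (centi / 100.0) / count;
    }

    public static void main(String[] args) {
        record(10.5f, 4.25f);
        record(3.75f, 0.5f);
        System.out.println(averageDelay()); // prints 9.5
    }
}
```

In the real job the two increment calls would go through context.getCounter(...).increment(...) as in the code above, and the division would happen in main after waitForCompletion returns.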