Sharing data between master and reduce

自闭症患者 2021-01-24 18:42

I need to perform an aggregation using the results from all the reduce tasks. Basically, each reduce task finds the sum and count of a value. I need to add all the sums and counts a

2 answers
  •  醉话见心
    2021-01-24 19:10

    Found the solution: I used counters.

    http://diveintodata.org/2011/03/15/an-example-of-hadoop-mapreduce-counter/

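    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Reducer;

    public class FlightData {

        // Counters incremented by the reducers and read back by the driver
        public static enum FlightCounters {
            FLIGHT_COUNT,
            FLIGHT_DELAY;
        }

        // The type parameters were lost in the original post; Text is assumed here
        public static class MyReducer
                extends Reducer<Text, Text, Text, Text> {

            @Override
            public void reduce(Text key, Iterable<Text> values,
                    Context context
                    ) throws IOException, InterruptedException {

                // origin and dest are String[] records built from values;
                // that parsing was omitted in the original post
                float delay1 = Float.parseFloat(origin[5]);
                float delay2 = Float.parseFloat(dest[5]);
                context.getCounter(FlightCounters.FLIGHT_COUNT).increment(1);
                // Counters hold long values, so the fractional part of the delay is dropped
                context.getCounter(FlightCounters.FLIGHT_DELAY)
                        .increment((long) (delay1 + delay2));
            }
        }

        public static void main(String[] args) throws Exception {
            float flightCount, flightDelay;
            // Job creation and configuration were omitted in the original post
            job.waitForCompletion(true);
            // Get the final results updated in the counters by all map and reduce tasks
            flightCount = job.getCounters()
                    .findCounter(FlightCounters.FLIGHT_COUNT).getValue();
            flightDelay = job.getCounters()
                    .findCounter(FlightCounters.FLIGHT_DELAY).getValue();
        }
    }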
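One thing to watch in the snippet above: a counter only stores a long, so the cast `(long) (delay1 + delay2)` throws away the fractional part of every delay before it is added. A possible workaround, sketched below without any Hadoop dependency, is to scale the value to fixed-point before incrementing and to divide again in the driver. `LongCounter`, `SCALE`, `recordFlight` and `averageDelay` are hypothetical names made up for this illustration; `LongCounter` only mimics the `increment(long)` and `getValue()` calls used on the real counters in the answer.

```java
public class CounterAggregationSketch {

    // Hypothetical scale factor: delays are stored as thousandths
    static final long SCALE = 1000L;

    // Hypothetical stand-in for a Hadoop counter, which holds a single long
    static final class LongCounter {
        private long value;

        void increment(long amount) {
            value += amount;
        }

        long getValue() {
            return value;
        }
    }

    // What one reduce call would do: count one flight and add its
    // combined delay in fixed-point so the fraction is not lost
    static void recordFlight(LongCounter count, LongCounter delayScaled,
                             float delay1, float delay2) {
        count.increment(1);
        delayScaled.increment(Math.round((double) (delay1 + delay2) * SCALE));
    }

    // What the driver would do once the job has finished:
    // undo the scaling and divide the total delay by the flight count
    static double averageDelay(LongCounter count, LongCounter delayScaled) {
        long flights = count.getValue();
        if (flights == 0) {
            return 0.0;
        }
        return (delayScaled.getValue() / (double) SCALE) / flights;
    }

    public static void main(String[] args) {
        LongCounter count = new LongCounter();
        LongCounter delayScaled = new LongCounter();
        recordFlight(count, delayScaled, 1.5f, 2.25f);
        recordFlight(count, delayScaled, 0.5f, 0.25f);
        System.out.println(averageDelay(count, delayScaled)); // prints 2.25
    }
}
```

With the plain cast, the same two flights would contribute 3 and 0 to the delay counter, giving an average of 1.5 instead of 2.25. The scale factor is a trade-off: a larger one keeps more precision but brings the total closer to the range limit of a long.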
