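The combineByKey operator
combineByKey takes three functions: createCombiner (builds the initial accumulator the first time a key is seen in a partition), mergeValue (merges a new value into the accumulator that key already has in the same partition), and mergeCombiners (merges the accumulators that different partitions produced for the same key).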
// Assumes ScoreDetail exposes a float field `score` and pairRDD is a JavaPairRDD<String, ScoreDetail>.
// createCombiner: first value seen for a key in a partition becomes (score, 1).
Function<ScoreDetail, Tuple2<Float, Integer>> createCombine = new Function<ScoreDetail, Tuple2<Float, Integer>>() {
    @Override
    public Tuple2<Float, Integer> call(ScoreDetail scoreDetail) throws Exception {
        return new Tuple2<>(scoreDetail.score, 1);
    }
};
// mergeValue: fold another score for the same key into the running (sum, count).
Function2<Tuple2<Float, Integer>, ScoreDetail, Tuple2<Float, Integer>> mergeValue = new Function2<Tuple2<Float, Integer>, ScoreDetail, Tuple2<Float, Integer>>() {
    @Override
    public Tuple2<Float, Integer> call(Tuple2<Float, Integer> tp, ScoreDetail scoreDetail) throws Exception {
        return new Tuple2<>(tp._1() + scoreDetail.score, tp._2() + 1);
    }
};
// mergeCombiners: add up the (sum, count) pairs produced by different partitions.
Function2<Tuple2<Float, Integer>, Tuple2<Float, Integer>, Tuple2<Float, Integer>> mergeCombiners = new Function2<Tuple2<Float, Integer>, Tuple2<Float, Integer>, Tuple2<Float, Integer>>() {
    @Override
    public Tuple2<Float, Integer> call(Tuple2<Float, Integer> tp1, Tuple2<Float, Integer> tp2) throws Exception {
        return new Tuple2<>(tp1._1() + tp2._1(), tp1._2() + tp2._2());
    }
};
// Result: for each key, the pair (sum of scores, number of scores).
JavaPairRDD<String, Tuple2<Float, Integer>> combineByRDD = pairRDD.combineByKey(createCombine, mergeValue, mergeCombiners);
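To see when each of the three functions fires, the flow can be traced without a cluster. The following is only a plain-Java imitation of the semantics, not Spark's implementation: the class CombineByKeySketch, the SumCount type (standing in for Tuple2<Float, Integer>), the helper names, and the sample scores are all hypothetical, and each inner list plays the role of one partition.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;

class CombineByKeySketch {

    // Hypothetical stand-in for Tuple2<Float, Integer>: running (sum, count).
    static final class SumCount {
        final float sum;
        final int count;
        SumCount(float sum, int count) { this.sum = sum; this.count = count; }
    }

    // Imitates combineByKey over in-memory "partitions" (one inner list each).
    static <K, V, C> Map<K, C> combineByKey(
            List<List<Map.Entry<K, V>>> partitions,
            Function<V, C> createCombiner,
            BiFunction<C, V, C> mergeValue,
            BinaryOperator<C> mergeCombiners) {
        Map<K, C> merged = new TreeMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            // Step 1: inside one partition, build one accumulator per key.
            Map<K, C> local = new HashMap<>();
            for (Map.Entry<K, V> record : partition) {
                K key = record.getKey();
                C acc = local.get(key);
                local.put(key, acc == null
                        ? createCombiner.apply(record.getValue())      // first time this key is seen here
                        : mergeValue.apply(acc, record.getValue()));   // key already has an accumulator
            }
            // Step 2: merge this partition's accumulators into the overall result.
            for (Map.Entry<K, C> e : local.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), mergeCombiners);
            }
        }
        return merged;
    }

    // Helper that builds one (name, score) record.
    static Map.Entry<String, Float> rec(String name, float score) {
        return new AbstractMap.SimpleEntry<>(name, score);
    }

    // Made-up sample data split across two partitions.
    static List<List<Map.Entry<String, Float>>> samplePartitions() {
        return Arrays.asList(
                Arrays.asList(rec("Alice", 80f), rec("Bob", 60f), rec("Alice", 90f)),
                Arrays.asList(rec("Alice", 100f), rec("Bob", 80f)));
    }

    // Same three functions as in the article, followed by sum / count per key.
    static Map<String, Float> averages(List<List<Map.Entry<String, Float>>> partitions) {
        Map<String, SumCount> totals = combineByKey(
                partitions,
                score -> new SumCount(score, 1),
                (acc, score) -> new SumCount(acc.sum + score, acc.count + 1),
                (a, b) -> new SumCount(a.sum + b.sum, a.count + b.count));
        Map<String, Float> result = new TreeMap<>();
        for (Map.Entry<String, SumCount> e : totals.entrySet()) {
            result.put(e.getKey(), e.getValue().sum / e.getValue().count);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(averages(samplePartitions())); // prints {Alice=90.0, Bob=70.0}
    }
}
```

In the first partition, Alice triggers createCombiner once and mergeValue once; mergeCombiners only runs when the second partition's (sum, count) for the same key is folded in. Keeping (sum, count) rather than a running average is what makes the cross-partition merge a simple addition.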
Source: CSDN
Author: 夜雨听蝉鸣
Link: https://blog.csdn.net/a013399445/article/details/104740398