I want to use a Java 8 Stream and group by one classifier, but apply multiple Collector functions. So when grouping, I would collect, for example, the average and the sum of one field (or maybe
By using a map as the output type, one could have an arbitrarily long list of reducers, each producing its own statistic and adding it to the map.
// Returns a copy of map with (k, v) added; the input map is left untouched.
public static <K, V> Map<K, V> addMap(Map<K, V> map, K k, V v) {
    Map<K, V> mapout = new HashMap<>(map);
    mapout.put(k, v);
    return mapout;
}
...
List<Person> persons = new ArrayList<>();
persons.add(new Person("Person One", 1, 18));
persons.add(new Person("Person Two", 1, 20));
persons.add(new Person("Person Three", 1, 30));
persons.add(new Person("Person Four", 2, 30));
persons.add(new Person("Person Five", 2, 29));
persons.add(new Person("Person Six", 3, 18));
List<BiFunction<Map<String, Integer>, Person, Map<String, Integer>>> listOfReducers = new ArrayList<>();
listOfReducers.add((m, p) -> addMap(m, "Count", Optional.ofNullable(m.get("Count")).orElse(0) + 1));
listOfReducers.add((m, p) -> addMap(m, "Sum", Optional.ofNullable(m.get("Sum")).orElse(0) + p.i1));
BiFunction<Map<String, Integer>, Person, Map<String, Integer>> applyList
        = (mapin, p) -> {
            Map<String, Integer> mapout = mapin;
            for (BiFunction<Map<String, Integer>, Person, Map<String, Integer>> f : listOfReducers) {
                mapout = f.apply(mapout, p);
            }
            return mapout;
        };
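// Merge partial results by adding the values per key. A plain putAll would
// let one side overwrite the other's totals if the stream runs in parallel.
BinaryOperator<Map<String, Integer>> combineMaps
        = (map1, map2) -> {
            Map<String, Integer> mapout = new HashMap<>(map1);
            map2.forEach((k, v) -> mapout.merge(k, v, Integer::sum));
            return mapout;
        };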
Map<String, Integer> map
        = persons
                .stream()
                .reduce(new HashMap<String, Integer>(),
                        applyList, combineMaps);
System.out.println("map = " + map);
Produces:
map = {Sum=10, Count=6}
Instead of chaining the collectors, you should build an abstraction which is an aggregator of collectors: implement the Collector
interface with a class which accepts a list of collectors and delegates each method invocation to each of them. Then, in the end, you return new Data()
with all the results the nested collectors produced.
You can avoid creating a custom class with all the method declarations by using Collector.of(supplier, accumulator, combiner, finisher, Collector.Characteristics... characteristics). The finisher lambda will call the finisher of each nested collector, then return the Data instance.
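A minimal sketch of that idea for exactly two nested collectors, not a definitive implementation. The names pairing, Box, and summarize are made up for illustration, and Person and Data are hypothetical stand-ins because the thread does not show the real classes:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class PairingCollectorSketch {

    // Hypothetical stand-in for the Person class used in the thread.
    static final class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // Hypothetical result holder; the real Data class is not shown.
    static final class Data {
        final double average;
        final long sum;
        Data(double average, long sum) { this.average = average; this.sum = sum; }
        @Override public String toString() {
            return "Data{average=" + average + ", sum=" + sum + "}";
        }
    }

    // Mutable holder for the intermediate containers of both nested collectors.
    static final class Box<A1, A2> {
        A1 first;
        A2 second;
        Box(A1 first, A2 second) { this.first = first; this.second = second; }
    }

    // Delegates every collector step to c1 and c2, then merges their finished results.
    static <T, A1, A2, R1, R2, R> Collector<T, ?, R> pairing(
            Collector<T, A1, R1> c1,
            Collector<T, A2, R2> c2,
            BiFunction<? super R1, ? super R2, R> merger) {
        return Collector.<T, Box<A1, A2>, R>of(
            () -> new Box<>(c1.supplier().get(), c2.supplier().get()),
            (box, t) -> {
                c1.accumulator().accept(box.first, t);
                c2.accumulator().accept(box.second, t);
            },
            (left, right) -> {
                left.first = c1.combiner().apply(left.first, right.first);
                left.second = c2.combiner().apply(left.second, right.second);
                return left;
            },
            // The finisher calls each nested finisher, then builds the final result.
            box -> merger.apply(c1.finisher().apply(box.first),
                                c2.finisher().apply(box.second)));
    }

    static List<Person> samplePersons() {
        return Arrays.asList(
            new Person("Person One", 1, 18),
            new Person("Person Two", 1, 20),
            new Person("Person Three", 1, 30),
            new Person("Person Four", 2, 30),
            new Person("Person Five", 2, 29),
            new Person("Person Six", 3, 18));
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        Collector<Person, ?, Data> averageAndSum = pairing(
            Collectors.averagingInt(Person::getAge),
            Collectors.summingInt(Person::getAge),
            (avg, sum) -> new Data(avg, sum));
        return persons.stream()
            .collect(Collectors.groupingBy(Person::getGroup, averageAndSum));
    }

    public static void main(String[] args) {
        System.out.println(summarize(samplePersons()));
    }
}
```

The same pattern extends to more than two collectors by holding a list of containers instead of a fixed pair, at the cost of unchecked casts.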
You could chain them,
A collector can only produce one object, but this object can hold multiple values. You could return a Map for example where the map has an entry for each collector you are returning.
You can use Collector.of(HashMap::new, accumulator, combiner);
Your accumulator would hold a Map of Collectors, where each key of the produced Map matches the name of its Collector. The combiner would need a way to merge multiple partial results, especially when the stream runs in parallel.
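A simplified sketch of the map-valued collector, under the assumption that every statistic is additive. Instead of a map of full collectors, it keeps a map from statistic name to a function that contributes one int per element. statsCollector, contributions, and the Person class are hypothetical names for illustration only:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class MapCollectorSketch {

    // Hypothetical stand-in for the Person class used in the thread.
    static final class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // Builds a collector whose result map has one entry per named statistic.
    // Only additive statistics are supported in this sketch.
    static Collector<Person, Map<String, Integer>, Map<String, Integer>> statsCollector(
            Map<String, ToIntFunction<Person>> contributions) {
        return Collector.of(
            HashMap::new,
            // Accumulator: add each statistic's contribution for this person.
            (map, p) -> contributions.forEach(
                (name, f) -> map.merge(name, f.applyAsInt(p), Integer::sum)),
            // Combiner: add partial results key by key (needed for parallel streams).
            (left, right) -> {
                right.forEach((name, value) -> left.merge(name, value, Integer::sum));
                return left;
            });
    }

    static List<Person> samplePersons() {
        return Arrays.asList(
            new Person("Person One", 1, 18),
            new Person("Person Two", 1, 20),
            new Person("Person Three", 1, 30),
            new Person("Person Four", 2, 30),
            new Person("Person Five", 2, 29),
            new Person("Person Six", 3, 18));
    }

    static Map<Integer, Map<String, Integer>> summarize(List<Person> persons) {
        Map<String, ToIntFunction<Person>> contributions = new LinkedHashMap<>();
        contributions.put("Count", p -> 1);
        contributions.put("Sum", Person::getAge);
        return persons.stream().collect(
            Collectors.groupingBy(Person::getGroup, statsCollector(contributions)));
    }

    public static void main(String[] args) {
        System.out.println(summarize(samplePersons()));
    }
}
```

Statistics that are not simple sums, such as an average, would need either a finisher step or a richer value type than Integer.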
Generally, the built-in collectors use a dedicated data type for complex results.
From Collectors:
public static <T>
Collector<T, ?, DoubleSummaryStatistics> summarizingDouble(ToDoubleFunction<? super T> mapper) {
    return new CollectorImpl<T, DoubleSummaryStatistics, DoubleSummaryStatistics>(
            DoubleSummaryStatistics::new,
            (r, t) -> r.accept(mapper.applyAsDouble(t)),
            (l, r) -> { l.combine(r); return l; }, CH_ID);
}
and in its own class
public class DoubleSummaryStatistics implements DoubleConsumer {
    private long count;
    private double sum;
    private double sumCompensation; // Low order bits of sum
    private double simpleSum; // Used to compute right sum for non-finite inputs
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
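To see what that result type offers in practice, here is a small sketch that collects plain integers with summarizingDouble and reads several values from the single DoubleSummaryStatistics object. The class and method names are made up for illustration:

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

public class SummaryStatisticsSketch {

    // Collects the ages into one statistics object holding count, sum, min and max.
    static DoubleSummaryStatistics summarize(List<Integer> ages) {
        return ages.stream().collect(Collectors.summarizingDouble(Integer::doubleValue));
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics group1 = summarize(Arrays.asList(18, 20, 30));
        DoubleSummaryStatistics group2 = summarize(Arrays.asList(30, 29));

        System.out.println(group1.getCount()); // 3
        System.out.println(group1.getSum());   // 68.0
        System.out.println(group1.getMin());   // 18.0
        System.out.println(group1.getMax());   // 30.0

        // combine() folds another partial result into this one in place,
        // which is the step the collector's combiner performs.
        group1.combine(group2);
        System.out.println(group1.getCount()); // 5
        System.out.println(group1.getSum());   // 127.0
    }
}
```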
For the concrete problem of summing and averaging, use collectingAndThen along with summarizingDouble:
Map<Integer, Data> result = persons.stream().collect(
        groupingBy(Person::getGroup,
                collectingAndThen(summarizingDouble(Person::getAge),
                        dss -> new Data((long)dss.getAverage(), (long)dss.getSum()))));
For the more generic problem (collect various things about your Persons), you can create a complex collector like this:
// Individual collectors are defined here
List<Collector<Person, ?, ?>> collectors = Arrays.asList(
        Collectors.averagingInt(Person::getAge),
        Collectors.summingInt(Person::getAge));

@SuppressWarnings("unchecked")
Collector<Person, List<Object>, List<Object>> complexCollector = Collector.of(
    () -> collectors.stream().map(Collector::supplier)
        .map(Supplier::get).collect(toList()),
    (list, e) -> IntStream.range(0, collectors.size()).forEach(
        i -> ((BiConsumer<Object, Person>) collectors.get(i).accumulator()).accept(list.get(i), e)),
    (l1, l2) -> {
        IntStream.range(0, collectors.size()).forEach(
            i -> l1.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner()).apply(l1.get(i), l2.get(i))));
        return l1;
    },
    list -> {
        IntStream.range(0, collectors.size()).forEach(
            i -> list.set(i, ((Function<Object, Object>)collectors.get(i).finisher()).apply(list.get(i))));
        return list;
    });

Map<Integer, List<Object>> result = persons.stream().collect(
        groupingBy(Person::getGroup, complexCollector));
Map values are lists where the first element is the result of applying the first collector, and so on. You can add a custom finisher step using Collectors.collectingAndThen(complexCollector, list -> ...) to convert this list to something more appropriate.
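As a rough sketch of that finisher step, the following wraps a loop-based variant of complexCollector in collectingAndThen and casts the list elements by position. Person and Data are hypothetical stand-ins, and the casts assume the collectors are listed in the order averagingInt, summingInt:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ListCollectorFinisherSketch {

    // Hypothetical stand-in for the Person class used in the thread.
    static final class Person {
        final String name;
        final int group;
        final int age;
        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // Hypothetical result holder; the real Data class is not shown.
    static final class Data {
        final double average;
        final long sum;
        Data(double average, long sum) { this.average = average; this.sum = sum; }
    }

    // Same idea as complexCollector above, written with plain loops.
    @SuppressWarnings("unchecked")
    static Collector<Person, List<Object>, List<Object>> complexCollector(
            List<Collector<Person, ?, ?>> collectors) {
        return Collector.of(
            () -> {
                List<Object> containers = new ArrayList<>();
                for (Collector<Person, ?, ?> c : collectors) {
                    containers.add(c.supplier().get());
                }
                return containers;
            },
            (list, p) -> {
                for (int i = 0; i < collectors.size(); i++) {
                    ((BiConsumer<Object, Person>) collectors.get(i).accumulator())
                        .accept(list.get(i), p);
                }
            },
            (l1, l2) -> {
                for (int i = 0; i < collectors.size(); i++) {
                    l1.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner())
                        .apply(l1.get(i), l2.get(i)));
                }
                return l1;
            },
            list -> {
                for (int i = 0; i < collectors.size(); i++) {
                    list.set(i, ((Function<Object, Object>) collectors.get(i).finisher())
                        .apply(list.get(i)));
                }
                return list;
            });
    }

    static List<Person> samplePersons() {
        return Arrays.asList(
            new Person("Person One", 1, 18),
            new Person("Person Two", 1, 20),
            new Person("Person Three", 1, 30),
            new Person("Person Four", 2, 30),
            new Person("Person Five", 2, 29),
            new Person("Person Six", 3, 18));
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        List<Collector<Person, ?, ?>> collectors = Arrays.asList(
            Collectors.averagingInt(Person::getAge),
            Collectors.summingInt(Person::getAge));
        // Position 0 holds the Double average, position 1 the Integer sum.
        Collector<Person, ?, Data> toData = Collectors.collectingAndThen(
            complexCollector(collectors),
            list -> new Data((Double) list.get(0), (Integer) list.get(1)));
        return persons.stream()
            .collect(Collectors.groupingBy(Person::getGroup, toData));
    }
}
```

The positional casts are the weak point of this design: reordering the collectors list compiles fine but fails at runtime with a ClassCastException.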