Java 8 - Best way to transform a list: map or foreach?

隐瞒了意图╮ 2020-12-07 07:01

I have a list myListToParse where I want to filter the elements and apply a method on each element, and add the result in another list myFinalList.
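For reference, the two approaches the title contrasts could look roughly like the sketch below. This is not the asker's code: `doSomething` is a hypothetical stand-in for the method applied to each element, and the non-null filter is an assumed example of "filter the elements".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class TransformListExample {

    // Hypothetical stand-in for the method applied to each element.
    static String doSomething(String s) {
        return s.toUpperCase();
    }

    // Option 1: forEach that adds to a list declared outside the pipeline.
    static List<String> withForEach(List<String> myListToParse) {
        List<String> myFinalList = new ArrayList<>();
        myListToParse.stream()
                .filter(Objects::nonNull)   // assumed filter condition
                .forEach(elt -> myFinalList.add(doSomething(elt)));
        return myFinalList;
    }

    // Option 2: map + collect, with no list mutated from inside the pipeline.
    static List<String> withMap(List<String> myListToParse) {
        return myListToParse.stream()
                .filter(Objects::nonNull)   // assumed filter condition
                .map(TransformListExample::doSomething)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> myListToParse = Arrays.asList("a", null, "b");
        System.out.println(withForEach(myListToParse)); // [A, B]
        System.out.println(withMap(myListToParse));     // [A, B]
    }
}
```

Both variants give the same result on a sequential stream; the difference is whether the pipeline itself builds the list or a side effect does.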

8 answers
  • 2020-12-07 07:56

    Maybe a Method 3.

    I always prefer to keep logic separate.

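    import java.util.Arrays;
    import java.util.List;
    import java.util.function.Predicate;
    import java.util.stream.Collectors;

    // The filtering rule is defined once, apart from the pipeline that uses it.
    Predicate<Long> greaterThan100 = new Predicate<Long>() {
        @Override
        public boolean test(Long currentParameter) {
            return currentParameter > 100;
        }
    };

    List<Long> sourceLongList = Arrays.asList(1L, 10L, 50L, 80L, 100L, 120L, 133L, 333L);
    // Keeps 120, 133 and 333; collect() preserves encounter order even on a parallel stream.
    List<Long> resultList = sourceLongList.parallelStream().filter(greaterThan100).collect(Collectors.toList());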
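
The same separation can probably be written more compactly with a lambda. The sketch below is an assumed variant, not the answer's own code: the class name, the constant `GREATER_THAN_100`, and the helper `filterGreaterThan100` are made up for illustration, and it uses a sequential stream for simplicity.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class SeparatePredicateExample {

    // Same test as the anonymous class, written as a lambda (hypothetical name).
    static final Predicate<Long> GREATER_THAN_100 = value -> value > 100;

    // The pipeline only refers to the named predicate; the rule lives elsewhere.
    static List<Long> filterGreaterThan100(List<Long> source) {
        return source.stream()
                .filter(GREATER_THAN_100)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> sourceLongList = Arrays.asList(1L, 10L, 50L, 80L, 100L, 120L, 133L, 333L);
        System.out.println(filterGreaterThan100(sourceLongList)); // [120, 133, 333]
    }
}
```

A named predicate can also be reused or combined, for example with `GREATER_THAN_100.negate()`.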
    
  • 2020-12-07 08:05

    There is a third option: using stream().toArray(). See the comments under "why didn't stream have a toList method". It turned out to be slower than forEach() or collect(), and less expressive. It might be optimised in later JDK builds, so I am adding it here just in case.

    Assuming a List<String>:

        myFinalList = Arrays.asList(
                myListToParse.stream()
                        .filter(Objects::nonNull)
                        .map(this::doSomething)
                        .toArray(String[]::new)
        );
    

    I timed it with a micro-micro benchmark: 1M entries, 20% nulls, and a simple transform in doSomething().

    private LongSummaryStatistics benchmark(final String testName, final Runnable methodToTest, int samples) {
        long[] timing = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.currentTimeMillis();
            methodToTest.run();
            timing[i] = System.currentTimeMillis() - start;
        }
        final LongSummaryStatistics stats = Arrays.stream(timing).summaryStatistics();
        System.out.println(testName + ": " + stats);
        return stats;
    }
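
The answer does not show how the helper was driven, so here is one way it might be called. Everything about the setup is an assumption: `buildInput`, the `doSomething` body, and the class name are hypothetical, and the helper is copied as a static method only so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.Objects;
import java.util.stream.Collectors;

public class BenchmarkDriverSketch {

    // Hypothetical "simple transform"; the original doSomething() is not shown.
    static String doSomething(String s) {
        return s + "!";
    }

    // Builds test data in which every fifth entry is null (20% nulls).
    static List<String> buildInput(int size) {
        List<String> input = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            input.add(i % 5 == 0 ? null : "item" + i);
        }
        return input;
    }

    // Static copy of the timing helper from the answer (wall-clock milliseconds).
    static LongSummaryStatistics benchmark(String testName, Runnable methodToTest, int samples) {
        long[] timing = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.currentTimeMillis();
            methodToTest.run();
            timing[i] = System.currentTimeMillis() - start;
        }
        LongSummaryStatistics stats = Arrays.stream(timing).summaryStatistics();
        System.out.println(testName + ": " + stats);
        return stats;
    }

    public static void main(String[] args) {
        List<String> myListToParse = buildInput(1_000_000);

        // Variant under test: toArray() wrapped in Arrays.asList().
        benchmark("toArray", () -> Arrays.asList(
                myListToParse.stream()
                        .filter(Objects::nonNull)
                        .map(BenchmarkDriverSketch::doSomething)
                        .toArray(String[]::new)), 10);

        // Variant under test: collect(Collectors.toList()).
        benchmark("collect", () -> myListToParse.stream()
                .filter(Objects::nonNull)
                .map(BenchmarkDriverSketch::doSomething)
                .collect(Collectors.toList()), 10);
    }
}
```

Timings from a loop like this include JIT warm-up and GC noise, so they are only a rough indication.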
    

    The results (times in milliseconds) are:

    parallel:

    toArray: LongSummaryStatistics{count=10, sum=3721, min=321, average=372,100000, max=535}
    forEach: LongSummaryStatistics{count=10, sum=3502, min=249, average=350,200000, max=389}
    collect: LongSummaryStatistics{count=10, sum=3325, min=265, average=332,500000, max=368}
    

    sequential:

    toArray: LongSummaryStatistics{count=10, sum=5493, min=517, average=549,300000, max=569}
    forEach: LongSummaryStatistics{count=10, sum=5316, min=427, average=531,600000, max=571}
    collect: LongSummaryStatistics{count=10, sum=5380, min=444, average=538,000000, max=557}
    

    Parallel, without nulls and without the filter (so the stream is SIZED): toArray() has the best performance in this case. .forEach() failed with "indexOutOfBounds" on the recipient ArrayList, which is not thread-safe, so I had to replace it with .forEachOrdered().

    toArray: LongSummaryStatistics{count=100, sum=75566, min=707, average=755,660000, max=1107}
    forEach: LongSummaryStatistics{count=100, sum=115802, min=992, average=1158,020000, max=1254}
    collect: LongSummaryStatistics{count=100, sum=88415, min=732, average=884,150000, max=1014}
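
The forEach() failure mentioned above is what you would expect when several threads call add() on a plain ArrayList at once. Below is a minimal sketch of the two safe alternatives; the class and method names are hypothetical, and doubling an Integer stands in for the real transform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelCollectExample {

    // Unsafe pattern, left commented out: concurrent add() on a shared ArrayList.
    // source.parallelStream().forEach(x -> result.add(x * 2));

    // forEachOrdered runs the action one element at a time, in encounter order,
    // so the ArrayList is never modified by two threads at once.
    static List<Integer> withForEachOrdered(List<Integer> source) {
        List<Integer> result = new ArrayList<>();
        source.parallelStream().map(x -> x * 2).forEachOrdered(result::add);
        return result;
    }

    // collect() lets the stream build per-thread partial lists and merge them.
    static List<Integer> withCollect(List<Integer> source) {
        return source.parallelStream().map(x -> x * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.range(0, 100_000).boxed().collect(Collectors.toList());
        // Both approaches keep the encounter order, so the lists are equal.
        System.out.println(withForEachOrdered(source).equals(withCollect(source))); // true
    }
}
```

With forEachOrdered() the map step can still run in parallel, but the adds are serialized, which may explain part of the gap in the numbers above.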
    