Flattening a collection

走了就别回头了 2020-11-30 02:55

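Say I have a Map whose values are lists, e.g. a Map<String, List<String>>.

I can get the values of the map easily enough and iterate over them to produce a single List<String>. Is there a shorter way to flatten it?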
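
For reference, the loop described in the question might look like the following minimal sketch. The names FlattenLoop, someMap and someList are illustrative assumptions, and a LinkedHashMap is used only so that the output order is predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlattenLoop {
    // Appends every value list of the map to one result list, in iteration order.
    static List<String> flatten(Map<?, List<String>> someMap) {
        List<String> someList = new ArrayList<>();
        for (List<String> list : someMap.values()) {
            someList.addAll(list);
        }
        return someList;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the question.
        Map<String, List<String>> someMap = new LinkedHashMap<>();
        someMap.put("a", Arrays.asList("1", "2"));
        someMap.put("b", Arrays.asList("3"));
        System.out.println(flatten(someMap)); // prints [1, 2, 3]
    }
}
```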

9 Answers
  • 2020-11-30 03:36

    No, there is no shorter method. You have to use a loop.

    Update Apr 2014: Java 8 has finally come out. In the new version you can use the Iterable.forEach method to walk over a collection without using an explicit loop.

    Update Nov 2017: Found this question by chance when looking for a modern solution. Ended up going with reduce:

    someMap.values().stream().reduce(new ArrayList<>(), (accum, list) -> {
        accum.addAll(list);
        return accum;
    });
    

    This avoids both depending on the mutable external state of forEach(someList::addAll) and the overhead of flatMap(List::stream).
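
    One caveat, offered as a sketch rather than a definitive fix: the lambda above mutates the identity list passed to reduce, which Stream.reduce is not designed for and which can misbehave on a parallel stream. The three-argument Stream.collect is the JDK's mutable-reduction counterpart. The class name FlattenCollect and the sample data below are illustrative assumptions.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    class FlattenCollect {
        // Mutable reduction: the supplier creates the result container,
        // the accumulator appends one value list, the combiner merges partial results.
        static List<String> flatten(Map<?, List<String>> someMap) {
            return someMap.values().stream()
                    .collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
        }

        public static void main(String[] args) {
            // Hypothetical sample data; LinkedHashMap keeps insertion order.
            Map<String, List<String>> someMap = new LinkedHashMap<>();
            someMap.put("a", Arrays.asList("1", "2"));
            someMap.put("b", Arrays.asList("3"));
            System.out.println(flatten(someMap)); // prints [1, 2, 3]
        }
    }
    ```

    Like the reduce version, this needs neither an externally declared list nor a nested stream per value.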

  • 2020-11-30 03:37

    Suggested by a colleague:

    listOfLists.stream().flatMap(e -> e.stream()).collect(Collectors.toList())
    

    I like it better than forEach().
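
    A self-contained version of this idea might look as follows, assuming listOfLists is a collection of lists and using Collectors.toList() from java.util.stream. The class name FlattenFlatMap and the sample values are illustrative.

    ```java
    import java.util.Arrays;
    import java.util.Collection;
    import java.util.List;
    import java.util.stream.Collectors;

    class FlattenFlatMap {
        // Turns each inner list into a stream and concatenates them into one list.
        static <T> List<T> flatten(Collection<List<T>> listOfLists) {
            return listOfLists.stream()
                    .flatMap(List::stream)
                    .collect(Collectors.toList());
        }

        public static void main(String[] args) {
            // Hypothetical sample data.
            List<List<String>> listOfLists = Arrays.asList(
                    Arrays.asList("1", "2"),
                    Arrays.asList("3"));
            System.out.println(flatten(listOfLists)); // prints [1, 2, 3]
        }
    }
    ```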

  • 2020-11-30 03:39

    With Java 8, if you prefer not to instantiate a List yourself, as you have to in the suggested (and accepted) solution

    someMap.values().forEach(someList::addAll);
    

    You could do it all by streaming with this statement:

    List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
    

    By the way, it may be interesting to know that on Java 8 the accepted version does indeed seem to be the fastest. It takes about the same time as a

    for (List<String> item : someMap.values()) ...
    

    and is way faster than the pure streaming solution. Here is my little test code. I deliberately don't call it a benchmark, to avoid the resulting discussion of benchmark flaws. ;) I run every test twice so that the second round hopefully runs fully compiled code.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    // Wrapper class added so the snippet compiles; the name FlattenTest is arbitrary.
    public class FlattenTest {
        public static void main(String[] args) {
            Map<String, List<String>> map = new HashMap<>();
            long millis;

            // Three small lists, twelve strings in total.
            map.put("test", Arrays.asList("1", "2", "3", "4"));
            map.put("test2", Arrays.asList("10", "20", "30", "40"));
            map.put("test3", Arrays.asList("100", "200", "300", "400"));

            int maxcounter = 1000000;

            // Round 1: each variant runs maxcounter times; elapsed milliseconds are printed.
            System.out.println("1 stream flatmap");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
            }
            System.out.println(System.currentTimeMillis() - millis);

            System.out.println("1 parallel stream flatmap");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
            }
            System.out.println(System.currentTimeMillis() - millis);

            System.out.println("1 foreach");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> mylist = new ArrayList<String>();
                map.values().forEach(mylist::addAll);
            }
            System.out.println(System.currentTimeMillis() - millis);

            System.out.println("1 for");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> mylist = new ArrayList<String>();
                for (List<String> item : map.values()) {
                    mylist.addAll(item);
                }
            }
            System.out.println(System.currentTimeMillis() - millis);

            // Round 2: the same four variants again, now that the code has been warmed up.
            System.out.println("2 stream flatmap");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
            }
            System.out.println(System.currentTimeMillis() - millis);

            System.out.println("2 parallel stream flatmap");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
            }
            System.out.println(System.currentTimeMillis() - millis);

            System.out.println("2 foreach");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> mylist = new ArrayList<String>();
                map.values().forEach(mylist::addAll);
            }
            System.out.println(System.currentTimeMillis() - millis);

            System.out.println("2 for");
            millis = System.currentTimeMillis();
            for (int i = 0; i < maxcounter; i++) {
                List<String> mylist = new ArrayList<String>();
                for (List<String> item : map.values()) {
                    mylist.addAll(item);
                }
            }
            System.out.println(System.currentTimeMillis() - millis);
        }
    }
    

    And here are the results (elapsed milliseconds per test, each running 1,000,000 iterations):

    1 stream flatmap
    468
    1 parallel stream flatmap
    1529
    1 foreach
    140
    1 for
    172
    2 stream flatmap
    296
    2 parallel stream flatmap
    1482
    2 foreach
    156
    2 for
    141
    

    Edit 2016-05-24 (two years later):

    Running the same test with a current Java 8 version (U92) on the same machine:

    1 stream flatmap
    313
    1 parallel stream flatmap
    3257
    1 foreach
    109
    1 for
    141
    2 stream flatmap
    219
    2 parallel stream flatmap
    3830
    2 foreach
    125
    2 for
    140
    

    It seems that there is a speedup for sequential processing of streams and an even larger overhead for parallel streams.

    Edit 2018-10-18 (four years later):

    Now using Java 10 (10.0.2) on the same machine:

    1 stream flatmap
    393
    1 parallel stream flatmap
    3683
    1 foreach
    157
    1 for
    175
    2 stream flatmap
    243
    2 parallel stream flatmap
    5945
    2 foreach
    128
    2 for
    187
    

    The overhead of parallel streaming seems to have grown.

    Edit 2020-05-22 (six years later):

    Now using Java 14 (14.0.0.36) on a different machine:

    1 stream flatmap
    299
    1 parallel stream flatmap
    3209
    1 foreach
    202
    1 for
    170
    2 stream flatmap
    178
    2 parallel stream flatmap
    3270
    2 foreach
    138
    2 for
    167
    

    Note that this run was done on a different machine (though a comparable one, I think). The parallel streaming overhead seems to be considerably smaller than before.
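
    The test above can also be written more compactly. The following is only a sketch under stated assumptions: the names FlattenTimingSketch, timeNanos and sink are hypothetical, System.nanoTime replaces currentTimeMillis, and the result sizes are summed to make it harder for the JIT to discard the work. It is still a rough timing, not a rigorous benchmark, and the Supplier indirection means its numbers are not directly comparable with the results listed above.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.function.Supplier;
    import java.util.stream.Collectors;

    class FlattenTimingSketch {
        // Sum of all result sizes, kept so that the flattened lists are actually used.
        static long sink;

        // Runs the task `iterations` times and returns the elapsed time in nanoseconds.
        static long timeNanos(Supplier<List<String>> task, int iterations) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                sink += task.get().size();
            }
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            Map<String, List<String>> map = new HashMap<>();
            map.put("test", Arrays.asList("1", "2", "3", "4"));
            map.put("test2", Arrays.asList("10", "20", "30", "40"));
            map.put("test3", Arrays.asList("100", "200", "300", "400"));

            Supplier<List<String>> flatMap = () -> map.values().stream()
                    .flatMap(List::stream).collect(Collectors.toList());
            Supplier<List<String>> forEach = () -> {
                List<String> list = new ArrayList<>();
                map.values().forEach(list::addAll);
                return list;
            };

            int iterations = 1_000_000;
            // Warm-up round (timings ignored), then one measured round per variant.
            timeNanos(flatMap, iterations);
            timeNanos(forEach, iterations);
            System.out.println("stream flatmap ms: " + timeNanos(flatMap, iterations) / 1_000_000);
            System.out.println("foreach ms: " + timeNanos(forEach, iterations) / 1_000_000);
        }
    }
    ```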
