Custom thread pool in Java 8 parallel stream

旧巷少年郎 2020-11-22 00:15

Is it possible to specify a custom thread pool for a Java 8 parallel stream? I cannot find it anywhere.

Imagine that I have a server application and I would like to

15 Answers
  • 2020-11-22 01:03

    If you don't want to rely on implementation hacks, there's always a way to achieve the same by implementing custom collectors that will combine map and collect semantics... and you wouldn't be limited to ForkJoinPool:

    list.stream()
      .collect(parallelToList(i -> fetchFromDb(i), executor))
      .join()
    

    Luckily, it's done already here and available on Maven Central: http://github.com/pivovarit/parallel-collectors

    Disclaimer: I wrote it and take responsibility for it.
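For readers who want to see the general idea without adding a dependency, here is a minimal sketch of mapping list elements on a caller-supplied executor using only `CompletableFuture`. It is not the parallel-collectors implementation or its API; the class name `ParallelMapSketch` and the helper `parallelMap` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

class ParallelMapSketch {

    // Hypothetical helper: runs mapper for each item on the given executor
    // and returns a future holding the results in the original order.
    static <T, R> CompletableFuture<List<R>> parallelMap(
            List<T> items, Function<T, R> mapper, Executor executor) {
        // Start one asynchronous task per element.
        List<CompletableFuture<R>> futures = items.stream()
                .map(item -> CompletableFuture.supplyAsync(() -> mapper.apply(item), executor))
                .collect(Collectors.toList());
        // Complete once every task is done, then gather the results.
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture[0]))
                .thenApply(done -> futures.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Integer> result =
                    parallelMap(Arrays.asList(1, 2, 3, 4, 5), i -> i * 10, executor).join();
            System.out.println(result); // prints [10, 20, 30, 40, 50]
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the work runs on whatever `Executor` is passed in, the degree of parallelism is set by that executor rather than by the common ForkJoinPool.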

  • 2020-11-22 01:04

    You can use AbacusUtil, which lets you specify the number of threads for a parallel stream. Here is some sample code:

    LongStream.range(4, 1_000_000).parallel(threadNum)...
    

    Disclosure: I'm the developer of AbacusUtil.

  • 2020-11-22 01:10

    To measure the actual number of used threads, you can check Thread.activeCount():

        Runnable r = () -> IntStream
                .range(-42, +42)
                .parallel()
                .map(i -> Thread.activeCount())
                .max()
                .ifPresent(System.out::println);
    
        ForkJoinPool.commonPool().submit(r).join();
        new ForkJoinPool(42).submit(r).join();
    

    On a 4-core CPU, this can produce output like:

    5 // common pool
    23 // custom pool
    

    Without .parallel() it gives:

    3 // common pool
    4 // custom pool
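`Thread.activeCount()` only estimates the active threads in the current thread group, so it also counts threads unrelated to the stream. As a complementary check, the sketch below records the names of the threads that actually execute stream elements. The class name `ThreadUsageProbe` and the method `workerNames` are hypothetical, and the number of names recorded will vary from run to run.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class ThreadUsageProbe {

    // Runs a parallel stream inside the given pool and returns the names
    // of all threads that processed at least one element.
    static Set<String> workerNames(ForkJoinPool pool) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        pool.submit(() -> IntStream.range(0, 1000)
                .parallel()
                .forEach(i -> names.add(Thread.currentThread().getName())))
            .join();
        return names;
    }

    public static void main(String[] args) {
        // Threads used when the stream runs in the common pool.
        System.out.println("common pool: " + workerNames(ForkJoinPool.commonPool()).size());

        // Threads used when the stream runs in a custom pool.
        ForkJoinPool custom = new ForkJoinPool(8);
        try {
            System.out.println("custom pool: " + workerNames(custom).size());
        } finally {
            custom.shutdown();
        }
    }
}
```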
    
  • 2020-11-22 01:11

    The original solution (setting the ForkJoinPool common parallelism property) no longer works. According to the links in the original answer, an update that breaks it has been backported to Java 8. As the linked threads mention, that solution was never guaranteed to work forever. Given that, the approach to use is the ForkJoinPool.submit(...) with .get() solution discussed in the accepted answer. I think the backport also fixes the unreliability of this solution.

    ForkJoinPool fjpool = new ForkJoinPool(10);
    System.out.println("stream.parallel");
    IntStream range = IntStream.range(0, 20);
    fjpool.submit(() -> range.parallel()
            .forEach((int theInt) ->
            {
                try { Thread.sleep(100); } catch (Exception ignore) {}
                System.out.println(Thread.currentThread().getName() + " -- " + theInt);
            })).get();
    System.out.println("list.parallelStream");
    int [] array = IntStream.range(0, 20).toArray();
    List<Integer> list = new ArrayList<>();
    for (int theInt: array)
    {
        list.add(theInt);
    }
    fjpool.submit(() -> list.parallelStream()
            .forEach((theInt) ->
            {
                try { Thread.sleep(100); } catch (Exception ignore) {}
                System.out.println(Thread.currentThread().getName() + " -- " + theInt);
            })).get();
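The same submit-then-get pattern can also return a value and release the pool afterwards. The following is a minimal sketch under that assumption; the class name `CustomPoolSum` and the method `sumOfSquares` are hypothetical. Note that running the stream inside the submitting pool relies on implementation behaviour rather than a documented guarantee.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CustomPoolSum {

    // Computes 0^2 + 1^2 + ... + (upperExclusive - 1)^2 with a parallel
    // stream started from inside a dedicated ForkJoinPool.
    static int sumOfSquares(int upperExclusive, int parallelism)
            throws InterruptedException, ExecutionException {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // Typed as Callable so the submit overload is unambiguous.
            Callable<Integer> task = () -> IntStream.range(0, upperExclusive)
                    .parallel()
                    .map(i -> i * i)
                    .sum();
            // get() blocks until the stream has finished and rethrows failures.
            return pool.submit(task).get();
        } finally {
            // Without shutdown, the pool's worker threads linger.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(10, 4)); // prints 285
    }
}
```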
    
  • 2020-11-22 01:11

    If you don't need a custom thread pool but rather want to limit the number of concurrent tasks, you can use:

    List<Path> paths = Stream.of("/path/file1.csv", "/path/file2.csv", "/path/file3.csv")
            .map(Paths::get)
            .collect(toList());
    List<List<Path>> partitions = Lists.partition(paths, 4); // Guava method
    
    partitions.forEach(group -> group.parallelStream().forEach(csvFilePath -> {
           // do your processing   
    }));
    

    (The duplicate question asking for this is locked, so please bear with me here.)
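If Guava is not on the classpath, the partitioning step can be sketched with plain JDK calls. The class `PartitionSketch` and its `partition` helper below are hypothetical stand-ins for `Lists.partition`, not Guava code. The groups are processed one after another, so at most one group's worth of elements is in flight at any time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartitionSketch {

    // Splits items into consecutive sublists of at most 'size' elements.
    // The sublists are views backed by the original list.
    static <T> List<List<T>> partition(List<T> items, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int start = 0; start < items.size(); start += size) {
            int end = Math.min(start + size, items.size());
            groups.add(items.subList(start, end));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.csv", "b.csv", "c.csv", "d.csv", "e.csv");
        // Each group runs in parallel; the next group starts only after
        // the previous one has finished.
        partition(files, 2).forEach(group ->
                group.parallelStream().forEach(file ->
                        System.out.println(Thread.currentThread().getName() + " -- " + file)));
    }
}
```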

  • 2020-11-22 01:13

    You can try implementing ForkJoinWorkerThreadFactory and injecting it into ForkJoinPool.

    public ForkJoinPool(int parallelism,
                        ForkJoinWorkerThreadFactory factory,
                        UncaughtExceptionHandler handler,
                        boolean asyncMode) {
        this(checkParallelism(parallelism),
             checkFactory(factory),
             handler,
             asyncMode ? FIFO_QUEUE : LIFO_QUEUE,
             "ForkJoinPool-" + nextPoolId() + "-worker-");
        checkPermission();
    }
    

    You can use this ForkJoinPool constructor to do this.

    Notes:

    1. If you use this, bear in mind that your thread implementation can affect scheduling by the JVM, which generally places fork-join threads on different cores (treating them as computational threads).
    2. The way fork-join schedules tasks onto threads is not affected.
    3. I haven't figured out how a parallel stream picks threads from the fork-join pool (I couldn't find proper documentation on it), so try a thread factory that gives its threads distinctive names to check whether the parallel stream's threads come from the custom factory you provide.
    4. The common pool won't use this custom thread factory.
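The naming check suggested in the notes can be sketched as follows: a factory that delegates to the default factory and then renames each worker, plus a method that records which threads a parallel stream runs on. The names `NamedWorkerFactoryDemo`, `NamedFactory`, and `streamThreadNames` are hypothetical, and the result depends on implementation behaviour rather than a documented guarantee.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinPool.ForkJoinWorkerThreadFactory;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class NamedWorkerFactoryDemo {

    // Creates workers through the default factory, then gives each one
    // a recognizable name.
    static final class NamedFactory implements ForkJoinWorkerThreadFactory {
        private final String prefix;
        private final AtomicInteger counter = new AtomicInteger();

        NamedFactory(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public ForkJoinWorkerThread newThread(ForkJoinPool pool) {
            ForkJoinWorkerThread thread =
                    ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
            thread.setName(prefix + counter.incrementAndGet());
            return thread;
        }
    }

    // Runs a parallel stream inside a pool built with NamedFactory and
    // returns the names of the threads that processed elements.
    static Set<String> streamThreadNames(int parallelism, String prefix) {
        ForkJoinPool pool = new ForkJoinPool(parallelism, new NamedFactory(prefix), null, false);
        try {
            Set<String> names = ConcurrentHashMap.newKeySet();
            pool.submit(() -> IntStream.range(0, 1000)
                    .parallel()
                    .forEach(i -> names.add(Thread.currentThread().getName())))
                .join();
            return names;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Every printed name should carry the custom prefix if the stream
        // ran on threads created by the custom factory.
        System.out.println(streamThreadNames(4, "custom-worker-"));
    }
}
```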
