Java 8 Stream with batch processing

醉梦人生 2020-11-28 02:42

I have a large file that contains a list of items.

I would like to create a batch of items, make an HTTP request with this batch (all of the items are needed as par

15 answers
  • 2020-11-28 03:30

    For completeness, here is a Guava solution.

    Iterators.partition(stream.iterator(), batchSize).forEachRemaining(this::process);
    

    In the question the collection is already available, so a stream isn't needed and it can be written as:

    Iterables.partition(data, batchSize).forEach(this::process);
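
    If Guava is not on the classpath, the same idea can be sketched with the plain JDK. This is only an illustrative stand-in, not Guava's implementation: the class BatchPartition and the helper forEachBatch are hypothetical names, and the helper eagerly hands each full chunk to a consumer instead of returning a lazy iterator of lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class BatchPartition {
    // Hypothetical helper (not a Guava API): drains the iterator in chunks
    // of at most batchSize elements and passes each chunk to the action.
    static <T> void forEachBatch(Iterator<T> items, int batchSize, Consumer<List<T>> action) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        while (items.hasNext()) {
            batch.add(items.next());
            if (batch.size() == batchSize) {
                action.accept(batch);
                // Start a fresh list so the consumer may keep the previous one.
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the last, possibly partial, batch.
        if (!batch.isEmpty()) {
            action.accept(batch);
        }
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        // Prints "Batch: [1, 2]", "Batch: [3, 4]", "Batch: [5]" on separate lines.
        forEachBatch(data.stream().iterator(), 2, batch -> System.out.println("Batch: " + batch));
    }
}
```

    Like the Guava one-liner, this consumes the stream sequentially through its iterator, so batches follow the encounter order of the source.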
    
  • 2020-11-28 03:31

    You could also take a look at cyclops-react (I am the author of this library). It implements the jOOλ interface (and by extension JDK 8 Streams), but unlike JDK 8 parallel streams it focuses on asynchronous operations (such as potentially blocking async I/O calls). JDK parallel streams, by contrast, focus on data parallelism for CPU-bound operations. cyclops-react works by managing aggregates of Future-based tasks under the hood, but presents a standard extended Stream API to end users.

    This sample code may help you get started

    LazyFutureStream.parallelCommonBuilder()
                    .react(data)
                    .grouped(BATCH_SIZE)                  
                    .map(this::process)
                    .run();
    

    There is a tutorial on batching here

    And a more general Tutorial here

    To use your own Thread Pool (which is probably more appropriate for blocking I/O), you could start processing with

         LazyReact reactor = new LazyReact(40);
    
         reactor.react(data)
                .grouped(BATCH_SIZE)                  
                .map(this::process)
                .run();
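
    For comparison, the "batches on a dedicated thread pool" idea can be approximated with the JDK alone. The following is a minimal sketch under stated assumptions, not the cyclops-react API: PooledBatches and processInBatches are hypothetical names, the input is assumed to be an in-memory List, and each batch is submitted as one CompletableFuture task to a fixed-size pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PooledBatches {
    // Hypothetical helper: splits data into batches and runs `process`
    // on each batch using a dedicated fixed-size thread pool.
    static <T, R> List<R> processInBatches(List<T> data, int batchSize, int threads,
                                           Function<List<T>, R> process) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<R>> futures = new ArrayList<>();
            for (int from = 0; from < data.size(); from += batchSize) {
                int to = Math.min(from + batchSize, data.size());
                // Copy the slice so each task owns its batch.
                List<T> batch = new ArrayList<>(data.subList(from, to));
                futures.add(CompletableFuture.supplyAsync(() -> process.apply(batch), pool));
            }
            // join() blocks until each batch is done; results are returned in batch order.
            return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.range(0, 45).boxed().collect(Collectors.toList());
        List<Integer> sizes = processInBatches(data, 10, 4, batch -> batch.size());
        System.out.println(sizes); // prints [10, 10, 10, 10, 5]
    }
}
```

    A separate pool keeps blocking HTTP calls off the common ForkJoinPool, which is the same motivation given above for passing your own thread pool.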
    
  • 2020-11-28 03:31

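    A pure Java 8 example that also works with parallel streams. Note that forEach on a parallel stream does not preserve encounter order, so the composition of each batch is then nondeterministic.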

    How to use:

    Stream<Integer> integerStream = IntStream.range(0, 45).parallel().boxed();
    CsStreamUtil.processInBatch(integerStream, 10, batch -> System.out.println("Batch: " + batch));
    

    The method declaration and implementation:

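    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import java.util.function.Consumer;
    import java.util.stream.Stream;

    public static <ElementType> void processInBatch(Stream<ElementType> stream, int batchSize, Consumer<Collection<ElementType>> batchProcessor)
    {
        // Shared buffer; every access inside forEach is guarded by synchronized (newBatch).
        List<ElementType> newBatch = new ArrayList<>(batchSize);

        stream.forEach(element -> {
            List<ElementType> fullBatch;

            synchronized (newBatch)
            {
                if (newBatch.size() < batchSize)
                {
                    // Still room in the current batch: buffer the element and stop here.
                    newBatch.add(element);
                    return;
                }
                else
                {
                    // Batch is full: snapshot it, then start the next batch with this element.
                    fullBatch = new ArrayList<>(newBatch);
                    newBatch.clear();
                    newBatch.add(element);
                }
            }

            // Process outside the lock so a slow consumer does not block other threads.
            batchProcessor.accept(fullBatch);
        });

        // Flush the final batch, which may be smaller than batchSize.
        if (!newBatch.isEmpty())
            batchProcessor.accept(new ArrayList<>(newBatch));
    }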
    