I have a large file that contains a list of items.
I would like to create a batch of items, make an HTTP request with this batch (all of the items are needed as par
For completeness, here is a Guava solution.
Iterators.partition(stream.iterator(), batchSize).forEachRemaining(this::process);
In the question the collection is already available, so a stream isn't needed and it can be written as:
Iterables.partition(data, batchSize).forEach(this::process);
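If Guava is not on the classpath, roughly the same iterator-based batching can be sketched with the plain JDK. This is only an illustration, not Guava's implementation: the class name `IteratorBatching` and the helper `forEachBatch` are made up for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;

public class IteratorBatching {

    // Hypothetical helper (not part of Guava): drains the iterator into lists
    // of at most batchSize elements and hands each list to the consumer.
    public static <T> void forEachBatch(Iterator<T> source, int batchSize, Consumer<List<T>> consumer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                consumer.accept(batch);
                // Start a fresh list so the consumer may keep the old one.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            consumer.accept(batch); // final, possibly shorter batch
        }
    }

    public static void main(String[] args) {
        // Works for stream.iterator() as well as collection.iterator().
        forEachBatch(IntStream.range(0, 7).boxed().iterator(), 3,
                batch -> System.out.println("Batch: " + batch));
    }
}
```

Run as is, this prints the batches [0, 1, 2], [3, 4, 5] and [6]. Like the iterator-based Guava call, it consumes the source sequentially, so it gives no parallelism by itself.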
You could also take a look at cyclops-react (I am the author of this library). It implements the jOOλ interface (and, by extension, JDK 8 Streams), but unlike JDK 8 parallel Streams it focuses on asynchronous operations (such as potentially blocking async I/O calls). JDK parallel Streams, by contrast, focus on data parallelism for CPU-bound operations. cyclops-react works by managing aggregates of Future-based tasks under the hood, but presents a standard extended Stream API to end users.
This sample code may help you get started
LazyFutureStream.parallelCommonBuilder()
                .react(data)
                .grouped(BATCH_SIZE)
                .map(this::process)
                .run();
There is a tutorial on batching here
And a more general Tutorial here
To use your own Thread Pool (which is probably more appropriate for blocking I/O), you could start processing with
LazyReact reactor = new LazyReact(40);
reactor.react(data)
       .grouped(BATCH_SIZE)
       .map(this::process)
       .run();
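For readers who want to see the "own thread pool for blocking I/O" idea without the library, here is a rough JDK-only sketch. It does not use or reproduce the cyclops-react API. The class `PooledBatching`, the helper `processBatches`, and the pool size of 4 are assumptions for illustration, and `List::size` stands in for the real HTTP call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PooledBatching {

    // Hypothetical helper: splits data into batches of at most batchSize,
    // runs process on each batch on the given pool, and waits for all results.
    public static <T, R> List<R> processBatches(List<T> data, int batchSize,
                                                Function<List<T>, R> process,
                                                ExecutorService pool) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<CompletableFuture<R>> futures = new ArrayList<>();
        for (int start = 0; start < data.size(); start += batchSize) {
            int end = Math.min(start + batchSize, data.size());
            // Copy the slice so each task owns its batch.
            List<T> batch = new ArrayList<>(data.subList(start, end));
            futures.add(CompletableFuture.supplyAsync(() -> process.apply(batch), pool));
        }
        // join() blocks until each batch is done; results come back in batch order.
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Dedicated pool, so blocking calls do not tie up the common ForkJoinPool.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> data = IntStream.range(0, 10).boxed().collect(Collectors.toList());
            // Stand-in for the HTTP request: just report the batch size.
            List<Integer> sizes = processBatches(data, 4, List::size, pool);
            System.out.println(sizes);
        } finally {
            pool.shutdown();
        }
    }
}
```

With ten items and a batch size of four, this prints [4, 4, 2]. Unlike the lazy stream above, this sketch needs the whole list in memory up front, which matches the question, where the collection is already available.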
Pure Java 8 example that works with parallel streams as well.
How to use:
Stream<Integer> integerStream = IntStream.range(0, 45).parallel().boxed();
CsStreamUtil.processInBatch(integerStream, 10, batch -> System.out.println("Batch: " + batch));
The method declaration and implementation:
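import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

public class CsStreamUtil
{
    public static <ElementType> void processInBatch(Stream<ElementType> stream, int batchSize, Consumer<Collection<ElementType>> batchProcessor)
    {
        // Shared buffer, guarded by its own monitor so parallel streams can fill it safely.
        List<ElementType> newBatch = new ArrayList<>(batchSize);
        stream.forEach(element -> {
            List<ElementType> fullBatch;
            synchronized (newBatch)
            {
                if (newBatch.size() < batchSize)
                {
                    newBatch.add(element);
                    return;
                }
                else
                {
                    // Buffer is full: hand off a copy and start the next batch with this element.
                    fullBatch = new ArrayList<>(newBatch);
                    newBatch.clear();
                    newBatch.add(element);
                }
            }
            // Process outside the lock so other threads can keep filling the buffer.
            batchProcessor.accept(fullBatch);
        });
        // Flush the last, possibly partial, batch.
        if (newBatch.size() > 0)
            batchProcessor.accept(new ArrayList<>(newBatch));
    }
}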
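A small check of how this behaves under a parallel stream may be useful. The sketch below repeats the same batching logic so that it compiles on its own. The class `BatchCheck` and the helper `collectBatches` are made up for this example. Because the batch processor can be called from several worker threads at once, the batches are collected in a thread-safe queue.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class BatchCheck {

    // Same logic as processInBatch above, repeated so the example is self-contained.
    public static <T> void processInBatch(Stream<T> stream, int batchSize,
                                          Consumer<Collection<T>> batchProcessor) {
        List<T> newBatch = new ArrayList<>(batchSize);
        stream.forEach(element -> {
            List<T> fullBatch;
            synchronized (newBatch) {
                if (newBatch.size() < batchSize) {
                    newBatch.add(element);
                    return;
                }
                fullBatch = new ArrayList<>(newBatch);
                newBatch.clear();
                newBatch.add(element);
            }
            batchProcessor.accept(fullBatch);
        });
        if (!newBatch.isEmpty()) {
            batchProcessor.accept(new ArrayList<>(newBatch));
        }
    }

    // Hypothetical helper: feeds a parallel stream of 0..count-1 through
    // processInBatch and collects the batches in a thread-safe queue.
    public static List<Collection<Integer>> collectBatches(int count, int batchSize) {
        Queue<Collection<Integer>> batches = new ConcurrentLinkedQueue<>();
        processInBatch(IntStream.range(0, count).parallel().boxed(), batchSize, batches::add);
        return new ArrayList<>(batches);
    }

    public static void main(String[] args) {
        List<Collection<Integer>> batches = collectBatches(45, 10);
        long total = batches.stream().mapToLong(Collection::size).sum();
        System.out.println(batches.size() + " batches, " + total + " elements");
    }
}
```

For 45 elements and a batch size of 10 this reports 5 batches and 45 elements. Which elements land in which batch, and their order within a batch, can differ from run to run with a parallel stream, so this approach fits cases where batch composition does not matter.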