I want to take a stream of strings and turn it into a stream of word pairs, e.g.:
I have: { "A", "Apple", "B", "Banana", "C", "Carrot" }
This should do what you want, based on @njzk2's comment about using the stream twice and skipping the first element the second time. It uses the zip method that you linked in your original question.
public static void main(String[] args) {
    List<String> input = Arrays.asList("A", "Apple", "B", "Banana", "C", "Carrot");

    // zip is the helper linked in the question; it is not part of the JDK.
    List<List<String>> paired = zip(input.stream(),
                                    input.stream().skip(1),
                                    (a, b) -> Arrays.asList(a, b))
            .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);

    System.out.println(paired);
}
This outputs a List<List<String>> with contents:
[[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]
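The zip helper itself is not shown in this answer. As a rough idea of what such a helper might look like, here is a minimal, sequential-only sketch. The class name ZipSketch, the adjacentPairs method and this particular zip signature are assumptions made for illustration; this is not the implementation linked in the question.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class ZipSketch {

    // Hypothetical helper: combines two streams element by element and stops
    // as soon as either of them runs out of elements. Sequential only.
    static <A, B, C> Stream<C> zip(Stream<A> left, Stream<B> right, BiFunction<A, B, C> combiner) {
        Iterator<A> l = left.iterator();
        Iterator<B> r = right.iterator();
        Iterator<C> zipped = new Iterator<C>() {
            @Override
            public boolean hasNext() {
                return l.hasNext() && r.hasNext();
            }

            @Override
            public C next() {
                return combiner.apply(l.next(), r.next());
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(zipped, Spliterator.ORDERED), false);
    }

    // Pairs every element with its successor by zipping the list with itself shifted by one.
    static List<List<String>> adjacentPairs(List<String> input) {
        return zip(input.stream(), input.stream().skip(1), (a, b) -> Arrays.asList(a, b))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(adjacentPairs(Arrays.asList("A", "Apple", "B", "Banana", "C", "Carrot")));
    }
}
```

Because the second stream is one element shorter, the zipped stream ends after the last complete pair.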
In the comments, you asked how to do this if you already have a Stream. Unfortunately, it's difficult, because Streams are not stateful, and there isn't really a concept of an "adjacent" element in a Stream. There is a good discussion of this here.
I can think of two ways to do it, but I don't think you're going to like either of them:
1. Convert the Stream to a List, and then use my solution above. Ugly, but it works as long as the Stream isn't infinite and performance doesn't matter very much.
2. Use the StreamEx-based approach described in the answer below, provided you are happy to work with a StreamEx and not a Stream, and are willing to add a dependency on a third-party library.
Also relevant to this discussion is this question: Can I duplicate a Stream in Java 8? It's not good news for your problem, but it is worth reading and may have a solution that's more appealing to you.
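A minimal sketch of the first option, assuming the stream is finite and small enough to buffer in memory. The class name CollectThenPair and the pairs method are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class CollectThenPair {

    // Buffers the whole stream into a list, then pairs each index with the next one.
    static <T> List<List<T>> pairs(Stream<T> stream) {
        List<T> items = stream.collect(Collectors.toList());
        // IntStream.range is empty when the upper bound is not above the lower bound,
        // so streams with fewer than two elements simply produce no pairs.
        return IntStream.range(0, items.size() - 1)
                .mapToObj(i -> Arrays.asList(items.get(i), items.get(i + 1)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(pairs(Stream.of("A", "Apple", "B", "Banana", "C", "Carrot")));
    }
}
```

The cost is one full pass to materialise the list before any pair is produced, which is why this does not work for infinite streams.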
You can use my StreamEx library, which enhances the standard Stream API. It has a pairMap method that does exactly what you need:
StreamEx.of("A", "Apple", "B", "Banana", "C", "Carrot")
        .pairMap((a, b) -> a + "," + b)
        .forEach(System.out::println);
Output:
A,Apple
Apple,B
B,Banana
Banana,C
C,Carrot
The pairMap argument is a function that converts a pair of adjacent elements into whatever suits your needs. If you have a Pair class in your project, you can use .pairMap(Pair::new) to get a stream of pairs. If you want to create a stream of two-element lists, you can use:
List<List<String>> list = StreamEx.of("A", "Apple", "B", "Banana", "C", "Carrot")
        .pairMap((a, b) -> StreamEx.of(a, b).toList())
        .toList();
System.out.println(list); // [[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]
This works with any element source (you can use StreamEx.of(collection), StreamEx.of(stream) and so on), works correctly if you have more stream operations before pairMap, and is very friendly to parallel processing (unlike solutions that involve stream zipping).
If your input is a List with fast random access and you actually want a List<List<String>> as the result, there's a shorter and somewhat different way to achieve this in my library, using ofSubLists:
List<String> input = Arrays.asList("A", "Apple", "B", "Banana", "C", "Carrot");
List<List<String>> list = StreamEx.ofSubLists(input, 2, 1).toList();
System.out.println(list); // [[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]
Behind the scenes, input.subList(i, i+2) is called for each input list position, so your data is not copied into new lists; instead, sublists are created that refer to the original list.
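To see what "refer to the original list" means in practice, here is a plain-JDK sketch of the same idea. It is only an illustration of subList views, not StreamEx's actual implementation, and the class name SubListViews and the windows method are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SubListViews {

    // Builds overlapping two-element windows as views over the input list (no copying).
    static <T> List<List<T>> windows(List<T> input) {
        return IntStream.range(0, input.size() - 1)
                .mapToObj(i -> input.subList(i, i + 2))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("A", "Apple", "B");
        List<List<String>> windows = windows(input);

        // A non-structural change to the original list is visible through the views.
        input.set(1, "Apricot");
        System.out.println(windows); // [[A, Apricot], [Apricot, B]]
    }
}
```

The flip side of avoiding the copy is that the windows are only valid as long as the original list is not structurally modified.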
Here's a minimal amount of code that creates a List<List<String>> of the pairs:
// testing is a sequential Stream<String>; the lambda records each pair as a side effect.
List<List<String>> pairs = new LinkedList<>();
testing.reduce((a, b) -> { pairs.add(Arrays.asList(a, b)); return b; });
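For reference, a self-contained version of this trick might look like the sketch below. The class name ReducePairs and the pairs method are assumptions for illustration, and the explicit sequential() call is an addition: the approach relies on side effects in the reducer, so it should not be run on a parallel stream.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Stream;

class ReducePairs {

    static List<List<String>> pairs(Stream<String> testing) {
        List<List<String>> pairs = new LinkedList<>();
        // The reducer records (a, b) as a side effect and returns b,
        // so b becomes the left-hand element of the next call.
        testing.sequential().reduce((a, b) -> {
            pairs.add(Arrays.asList(a, b));
            return b;
        });
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(pairs(Stream.of("A", "Apple", "B", "Banana", "C", "Carrot")));
    }
}
```

The Optional returned by reduce is ignored on purpose; only the collected pairs matter here.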
You can create a method that groups elements from a stream using Java 8's low-level stream building blocks, StreamSupport and Spliterator:
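import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class StreamUtils {

    // Sliding window of the given size that advances one element at a time.
    public static <T> Stream<List<T>> sliding(int size, Stream<T> stream) {
        return sliding(size, 1, stream);
    }

    // Sliding window of the given size that advances by step elements.
    // Assumes 1 <= step <= size.
    public static <T> Stream<List<T>> sliding(int size, int step, Stream<T> stream) {
        Spliterator<T> spliterator = stream.spliterator();

        // Estimate the number of windows when the source size is known.
        long estimateSize;
        if (!spliterator.hasCharacteristics(Spliterator.SIZED)) {
            estimateSize = Long.MAX_VALUE;
        } else if (size > spliterator.estimateSize()) {
            estimateSize = 0;
        } else {
            estimateSize = (spliterator.estimateSize() - size) / step + 1;
        }

        return StreamSupport.stream(
            new Spliterators.AbstractSpliterator<List<T>>(estimateSize, spliterator.characteristics()) {
                List<T> buffer = new ArrayList<>(size);

                @Override
                public boolean tryAdvance(Consumer<? super List<T>> consumer) {
                    // Fill the buffer until it holds a full window or the source is exhausted.
                    while (buffer.size() < size && spliterator.tryAdvance(buffer::add)) {
                        // Nothing to do
                    }
                    if (buffer.size() == size) {
                        // Keep the overlapping tail as the start of the next window.
                        List<T> keep = new ArrayList<>(buffer.subList(step, size));
                        consumer.accept(buffer);
                        buffer = keep;
                        return true;
                    }
                    // A trailing, incomplete window is dropped.
                    return false;
                }
            }, stream.isParallel());
    }
}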
The method and parameter names were inspired by their Scala counterparts.
Let's test it:
Stream<String> testing = Stream.of("A", "Apple", "B", "Banana", "C", "Carrot");
System.out.println(StreamUtils.sliding(2, testing).collect(Collectors.toList()));
[[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]
What about windows that don't repeat elements?
Stream<String> testing = Stream.of("A", "Apple", "B", "Banana", "C", "Carrot");
System.out.println(StreamUtils.sliding(2, 2, testing).collect(Collectors.toList()));
[[A, Apple], [B, Banana], [C, Carrot]]
And now with an infinite Stream:
StreamUtils.sliding(5, Stream.iterate(0, n -> n + 1))
        .limit(5)
        .forEach(System.out::println);
[0, 1, 2, 3, 4]
[1, 2, 3, 4, 5]
[2, 3, 4, 5, 6]
[3, 4, 5, 6, 7]
[4, 5, 6, 7, 8]