Why does Iterable not provide stream() and parallelStream() methods?

北海茫月 2020-11-28 17:56

I am wondering why the Iterable interface does not provide the stream() and parallelStream() methods. Consider the following class:

3 Answers
  • 2020-11-28 17:58

    If you know the size you could use java.util.Collection which provides the stream() method:

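    import java.util.AbstractCollection;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // AbstractCollection only requires iterator() and size(); stream() is inherited from Collection.
    public class Hand extends AbstractCollection<Card> {
        private final List<Card> list = new ArrayList<>();
        private final int capacity;

        //...

        @Override
        public Iterator<Card> iterator() {
            return list.iterator();
        }

        @Override
        public int size() {
            return list.size();
        }
    }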
    

    And then:

    new Hand().stream().map(...)
    

    I faced the same problem and was surprised that my Iterable implementation could be very easily extended to an AbstractCollection implementation by simply adding the size() method (luckily I had the size of the collection :-)

    You should also consider overriding Spliterator<E> spliterator().
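
    As a minimal sketch of that suggestion, the class below delegates spliterator() to the backing list, so streams split by index and report ordering instead of falling back to the iterator-based default. The Card enum, the add() override, and the HandDemo wrapper are hypothetical stand-ins added for illustration; the original answer does not show them.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;

class HandDemo {
    // Hypothetical stand-in for the Card type, which the answer does not show.
    enum Card { ACE, KING, QUEEN, JACK }

    static class Hand extends AbstractCollection<Card> {
        private final List<Card> list = new ArrayList<>();

        // AbstractCollection.add throws UnsupportedOperationException unless overridden.
        @Override
        public boolean add(Card card) {
            return list.add(card);
        }

        @Override
        public Iterator<Card> iterator() {
            return list.iterator();
        }

        @Override
        public int size() {
            return list.size();
        }

        // Reuse the backing list's spliterator rather than the inherited default.
        @Override
        public Spliterator<Card> spliterator() {
            return list.spliterator();
        }
    }

    // stream() comes from Collection and is built on spliterator().
    static List<String> names(Hand hand) {
        return hand.stream().map(Card::name).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Hand hand = new Hand();
        hand.add(Card.ACE);
        hand.add(Card.KING);
        System.out.println(names(hand)); // prints [ACE, KING]
    }
}
```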

  • 2020-11-28 18:18

    This was not an omission; there was detailed discussion on the EG list in June of 2013.

    The definitive discussion of the Expert Group is rooted at this thread.

    While it seemed "obvious" (even to the Expert Group, initially) that stream() made sense on Iterable, the fact that Iterable was so general became a problem, because the obvious signature:

    Stream<T> stream()
    

    was not always what you were going to want. Some things that were Iterable<Integer> would rather have their stream method return an IntStream, for example. But putting the stream() method this high up in the hierarchy would make that impossible. So instead, we made it really easy to make a Stream from an Iterable, by providing a spliterator() method. The implementation of stream() in Collection is just:

    default Stream<E> stream() {
        return StreamSupport.stream(spliterator(), false);
    }
    

    Any client can get the stream they want from an Iterable with:

    Stream<T> s = StreamSupport.stream(iter.spliterator(), false);
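
    For readers who want to try this, here is a small self-contained sketch of the same idea. The class and method names (IterableStreams, stream, intStream) are made up for illustration; only StreamSupport.stream and Iterable.spliterator come from the JDK. The second helper shows how a caller holding an Iterable<Integer> might choose an IntStream, which is the kind of choice a fixed Stream<T> stream() signature on Iterable would not allow.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IterableStreams {
    // Sequential Stream over any Iterable, built from its spliterator().
    static <T> Stream<T> stream(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    // The caller decides that a primitive stream is the better fit here.
    static IntStream intStream(Iterable<Integer> iterable) {
        return stream(iterable).mapToInt(Integer::intValue);
    }

    public static void main(String[] args) {
        List<Integer> backing = Arrays.asList(1, 2, 3);
        // A plain Iterable that is deliberately not a Collection.
        Iterable<Integer> numbers = backing::iterator;

        System.out.println(intStream(numbers).sum());                 // prints 6
        System.out.println(stream(numbers).filter(n -> n > 1).count()); // prints 2
    }
}
```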
    

    In the end we concluded that adding stream() to Iterable would be a mistake.

  • 2020-11-28 18:23

    I did an investigation in several of the project lambda mailing lists and I think I found a few interesting discussions.

    I have not found a satisfactory explanation so far. After reading all this I concluded it was just an omission. But you can see here that it was discussed several times over the years during the design of the API.

    Lambda Libs Spec Experts

    I found a discussion about this in the Lambda Libs Spec Experts mailing list:

    Under Iterable/Iterator.stream() Sam Pullara said:

    I was working with Brian on seeing how limit/substream functionality[1] might be implemented and he suggested conversion to Iterator was the right way to go about it. I had thought about that solution but didn't find any obvious way to take an iterator and turn it into a stream. It turns out it is in there, you just need to first convert the iterator to a spliterator and then convert the spliterator to a stream. So this brings me to revisit whether we should have these hanging off one of Iterable/Iterator directly or both.

    My suggestion is to at least have it on Iterator so you can move cleanly between the two worlds and it would also be easily discoverable rather than having to do:

    Streams.stream(Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED))

    And then Brian Goetz responded:

    I think Sam's point was that there are plenty of library classes that give you an Iterator but don't let you necessarily write your own spliterator. So all you can do is call stream(spliteratorUnknownSize(iterator)). Sam is suggesting that we define Iterator.stream() to do that for you.

    I would like to keep the stream() and spliterator() methods as being for library writers / advanced users.

    And later

    "Given that writing a Spliterator is easier than writing an Iterator, I would prefer to just write a Spliterator instead of an Iterator (Iterator is so 90s :)"

    You're missing the point, though. There are zillions of classes out there that already hand you an Iterator. And many of them are not spliterator-ready.
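
    The iterator-to-spliterator-to-stream conversion that Sam describes can be sketched with the API as it shipped in Java 8, where the entry point is StreamSupport.stream rather than the Streams.stream call quoted from the draft. The IteratorStreams class and its stream helper are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStreams {
    // Step 1: wrap the Iterator in a Spliterator of unknown size.
    // Step 2: turn that Spliterator into a sequential Stream.
    static <T> Stream<T> stream(Iterator<T> iterator) {
        Spliterator<T> spliterator =
                Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED);
        return StreamSupport.stream(spliterator, false);
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        System.out.println(
                stream(it).map(String::toUpperCase).collect(Collectors.toList()));
        // prints [A, B, C]
    }
}
```

    Note that an Iterator can only be traversed once, so a stream built this way is single-use in a stronger sense than usual: the source cannot be asked for a fresh stream.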

    Previous Discussions in Lambda Mailing List

    This may not be the answer you are looking for but in the Project Lambda mailing list this was briefly discussed. Perhaps this helps to foster a broader discussion on the subject.

    In the words of Brian Goetz under Streams from Iterable:

    Stepping back...

    There are lots of ways to create a Stream. The more information you have about how to describe the elements, the more functionality and performance the streams library can give you. In order of least to most information, they are:

    Iterator

    Iterator + size

    Spliterator

    Spliterator that knows its size

    Spliterator that knows its size, and further knows that all sub-splits know their size.

    (Some may be surprised to find that we can extract parallelism even from a dumb iterator in cases where Q (work per element) is nontrivial.)

    If Iterable had a stream() method, it would just wrap an Iterator with a Spliterator, with no size information. But, most things that are Iterable do have size information. Which means we're serving up deficient streams. That's not so good.

    One downside of the API practice outlined by Stephen here, of accepting Iterable instead of Collection, is that you are forcing things through a "small pipe" and therefore discarding size information when it might be useful. That's fine if all you're going to do is forEach it, but if you want to do more, it's better if you can preserve all the information you want.

    The default provided by Iterable would be a crappy one indeed -- it would discard size even though the vast majority of Iterables do know that information.
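
    The loss of size information described above can be observed directly. The sketch below compares the spliterator of an ArrayList with the default spliterator of the same elements viewed only as an Iterable. SizeInfoDemo and knowsSize are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SizeInfoDemo {
    // True if the source's spliterator reports an exact, known size.
    static boolean knowsSize(Iterable<?> source) {
        return source.spliterator().hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        // Same elements, but exposed only through Iterable's default spliterator().
        Iterable<Integer> view = list::iterator;

        System.out.println(knowsSize(list)); // prints true
        System.out.println(knowsSize(view)); // prints false
        System.out.println(list.spliterator().getExactSizeIfKnown()); // prints 3
        System.out.println(view.spliterator().getExactSizeIfKnown()); // prints -1
    }
}
```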

    Contradiction?

    That said, the discussion appears to reflect the changes the Expert Group made to the initial design of Streams, which was originally based on iterators.

    Even so, it is interesting to note that in an interface like Collection, the stream method is defined as:

    default Stream<E> stream() {
       return StreamSupport.stream(spliterator(), false);
    }
    

    This is exactly the same code that could be used in the Iterable interface.

    So, this is why I said this answer is probably not satisfactory, but still interesting for the discussion.

    Evidence of Refactoring

    Continuing with the analysis in the mailing list, it looks like the splitIterator method was originally in the Collection interface, and at some point in 2013 they moved it up to Iterable.

    Pull splitIterator up from Collection to Iterable.

    Conclusion/Theories?

    Then chances are that the lack of the method in Iterable is just an omission, since it looks like they should have moved the stream method as well when they moved the splitIterator up from Collection to Iterable.

    If there are other reasons, they are not evident. Does anybody else have other theories?
