Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?

后端 未结 2 1786
遇见更好的自我
遇见更好的自我 2020-11-29 09:51

It can be useful to iterate over multiple variables at once, overlapping (slice::windows), or not (slice::chunks).

This only works for slices; is it possible to do t

相关标签:
2条回答
  • 2020-11-29 10:30

    TL;DR: The best way to have chunks and windows on an arbitrary iterator/collection is to first collect it into a Vec and iterate over that.


    The exact syntax requested is impossible in Rust.

    The issue is that in Rust, a function's signature is depending on types, not values, and while Dependent Typing exists, there are few languages that implement it (it's hard).

    This is why chunks and windows return a sub-slice by the way; the number of elements in a &[T] is not part of the type and therefore can be decided at run-time.


    Let's pretend you asked for: for slice in some_iter.windows(2) instead then.

    Where would the storage backing this slice live?

    It cannot live:

    • in the original collection because a LinkedList doesn't have a contiguous storage
    • in the iterator because of the definition of Iterator::Item, there is no lifetime available

    So, unfortunately, slices can only be used when the backing storage is a slice.


    If dynamic allocations are accepted, then it is possible to use Vec<Iterator::Item> as the Item of the chunking iterator.

    struct Chunks<I: Iterator> {
        elements: Vec<<I as Iterator>::Item>,
        underlying: I,
    }
    
    impl<I: Iterator> Chunks<I> {
        fn new(iterator: I, size: usize) -> Chunks<I> {
            assert!(size > 0);
    
            let mut result = Chunks {
               underlying: iterator, elements: Vec::with_capacity(size)
            };
            result.refill(size);
            result
        }
    
        fn refill(&mut self, size: usize) {
            assert!(self.elements.is_empty());
    
            for _ in 0..size {
                match self.underlying.next() {
                    Some(item) => self.elements.push(item),
                    None => break,
                }
            }
        }
    }
    
    impl<I: Iterator> Iterator for Chunks<I> {
        type Item = Vec<<I as Iterator>::Item>;
    
        fn next(&mut self) -> Option<Self::Item> {
            if self.elements.is_empty() {
                return None;
            }
    
            let new_elements = Vec::with_capacity(self.elements.len());
            let result = std::mem::replace(&mut self.elements, new_elements);
    
            self.refill(result.len());
    
            Some(result)
        }
    }
    
    fn main() {
        let v = vec!(1, 2, 3, 4, 5);
    
        for slice in Chunks::new(v.iter(), 2) {
            println!("{:?}", slice);
        }
    }
    

    Will return:

    [1, 2]
    [3, 4]
    [5]
    

    The canny reader will realize that I surreptitiously switched from windows to chunks.

    windows is more difficult, because it returns the same element multiple times which require that the element be Clone. Also, since it needs returning a full Vec each time, it will need internally to keep a Vec<Vec<Iterator::Item>>.

    This is left as an exercise to the reader.


    Finally, a note on performance: all those allocations are gonna hurt (especially in the windows case).

    The best allocation strategy is generally to allocate a single chunk of memory and then live off that (unless the amount is really massive, in which case streaming is required).

    It's called collect::<Vec<_>>() in Rust.

    And since the Vec has a chunks and windows methods (by virtue of implementing Deref<Target=[T]>), you can then use that instead:

    for slice in v.iter().collect::<Vec<_>>().chunks(2) {
        println!("{:?}", slice);
    }
    
    for slice in v.iter().collect::<Vec<_>>().windows(2) {
        println!("{:?}", slice);
    }
    

    Sometimes the best solutions are the simplest.

    0 讨论(0)
  • 2020-11-29 10:46

    It's possible to take chunks of an iterator using Itertools::tuples, up to a 4-tuple:

    use itertools::Itertools; // 0.9.0
    
    fn main() {
        let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
    
        for (prev, next) in some_iter.tuples() {
            println!("{}--{}", prev, next);
        }
    }
    

    (playground)

    1--2
    3--4
    5--6
    

    If you don't know that your iterator exactly fits into the chunks, you can use Tuples::into_buffer to access any leftovers:

    use itertools::Itertools; // 0.9.0
    
    fn main() {
        let some_iter = vec![1, 2, 3, 4, 5].into_iter();
    
        let mut t = some_iter.tuples();
        for (prev, next) in t.by_ref() {
            println!("{}--{}", prev, next);
        }
        for leftover in t.into_buffer() {
            println!("{}", leftover);
        }
    }
    

    (playground)

    1--2
    3--4
    5
    

    It's also possible to take up to 4-tuple windows with Itertools::tuple_windows:

    use itertools::Itertools; // 0.9.0
    
    fn main() {
        let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
    
        for (prev, next) in some_iter.tuple_windows() {
            println!("{}--{}", prev, next);
        }
    }
    

    (playground)

    1--2
    2--3
    3--4
    4--5
    5--6
    

    If you need to get partial chunks / windows, you can get

    0 讨论(0)
提交回复
热议问题