Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?

20,137

Solution 1

Since Rust 1.51 this is possible with const generics where the iterator yields constant size arrays [T; N] for any N.

I built the little itermore crate which does this. It provides .chunks() and .windows() methods for any iterator.

for [a, b, c] in some_iter.chunks() {
    ...
}
for [prev, next] in some_iter.windows() {
    ...
}

Using the example given in the Itertools answer:

use itermore::IterMore;

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();

    for [prev, next] in some_iter.chunks() {
        println!("{}--{}", prev, next);
    }
}

This outputs

1--2
3--4
5--6

Most times the array size can be inferred but you can also specific it explicitly. Additionally, any reasonable size N can be used, there is no limit like in the Itertools case.

use itermore::IterMore;

fn main() {
    let mut iter = vec![1, 2, 3, 4, 5, 6].into_iter().windows::<5>();
    println!("{:?}", iter.next());
    println!("{:?}", iter.next());
    println!("{:?}", iter.next());
}

This outputs

Some([1, 2, 3, 4, 5])
Some([2, 3, 4, 5, 6])
None

Note: .windows() uses clone to yield elements multiple times so its best used for references and cheap to copy types.

Solution 2

It's possible to take chunks of an iterator using Itertools::tuples, up to a 4-tuple:

use itertools::Itertools; // 0.9.0

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();

    for (prev, next) in some_iter.tuples() {
        println!("{}--{}", prev, next);
    }
}

(playground)

1--2
3--4
5--6

If you don't know that your iterator exactly fits into the chunks, you can use Tuples::into_buffer to access any leftovers:

use itertools::Itertools; // 0.9.0

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5].into_iter();

    let mut t = some_iter.tuples();
    for (prev, next) in t.by_ref() {
        println!("{}--{}", prev, next);
    }
    for leftover in t.into_buffer() {
        println!("{}", leftover);
    }
}

(playground)

1--2
3--4
5

It's also possible to take up to 4-tuple windows with Itertools::tuple_windows:

use itertools::Itertools; // 0.9.0

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();

    for (prev, next) in some_iter.tuple_windows() {
        println!("{}--{}", prev, next);
    }
}

(playground)

1--2
2--3
3--4
4--5
5--6

If you need to get partial chunks / windows, you can get

Solution 3

TL;DR: The best way to have chunks and windows on an arbitrary iterator/collection is to first collect it into a Vec and iterate over that.


The exact syntax requested is impossible in Rust.

The issue is that in Rust, a function's signature is depending on types, not values, and while Dependent Typing exists, there are few languages that implement it (it's hard).

This is why chunks and windows return a sub-slice by the way; the number of elements in a &[T] is not part of the type and therefore can be decided at run-time.


Let's pretend you asked for: for slice in some_iter.windows(2) instead then.

Where would the storage backing this slice live?

It cannot live:

  • in the original collection because a LinkedList doesn't have a contiguous storage
  • in the iterator because of the definition of Iterator::Item, there is no lifetime available

So, unfortunately, slices can only be used when the backing storage is a slice.


If dynamic allocations are accepted, then it is possible to use Vec<Iterator::Item> as the Item of the chunking iterator.

struct Chunks<I: Iterator> {
    elements: Vec<<I as Iterator>::Item>,
    underlying: I,
}

impl<I: Iterator> Chunks<I> {
    fn new(iterator: I, size: usize) -> Chunks<I> {
        assert!(size > 0);

        let mut result = Chunks {
           underlying: iterator, elements: Vec::with_capacity(size)
        };
        result.refill(size);
        result
    }

    fn refill(&mut self, size: usize) {
        assert!(self.elements.is_empty());

        for _ in 0..size {
            match self.underlying.next() {
                Some(item) => self.elements.push(item),
                None => break,
            }
        }
    }
}

impl<I: Iterator> Iterator for Chunks<I> {
    type Item = Vec<<I as Iterator>::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.elements.is_empty() {
            return None;
        }

        let new_elements = Vec::with_capacity(self.elements.len());
        let result = std::mem::replace(&mut self.elements, new_elements);

        self.refill(result.len());

        Some(result)
    }
}

fn main() {
    let v = vec!(1, 2, 3, 4, 5);

    for slice in Chunks::new(v.iter(), 2) {
        println!("{:?}", slice);
    }
}

Will return:

[1, 2]
[3, 4]
[5]

The canny reader will realize that I surreptitiously switched from windows to chunks.

windows is more difficult, because it returns the same element multiple times which require that the element be Clone. Also, since it needs returning a full Vec each time, it will need internally to keep a Vec<Vec<Iterator::Item>>.

This is left as an exercise to the reader.


Finally, a note on performance: all those allocations are gonna hurt (especially in the windows case).

The best allocation strategy is generally to allocate a single chunk of memory and then live off that (unless the amount is really massive, in which case streaming is required).

It's called collect::<Vec<_>>() in Rust.

And since the Vec has a chunks and windows methods (by virtue of implementing Deref<Target=[T]>), you can then use that instead:

for slice in v.iter().collect::<Vec<_>>().chunks(2) {
    println!("{:?}", slice);
}

for slice in v.iter().collect::<Vec<_>>().windows(2) {
    println!("{:?}", slice);
}

Sometimes the best solutions are the simplest.

Share:
20,137
ideasman42
Author by

ideasman42

Updated on November 09, 2021

Comments

  • ideasman42
    ideasman42 over 2 years

    It can be useful to iterate over multiple variables at once, overlapping (slice::windows), or not (slice::chunks).

    This only works for slices; is it possible to do this for iterators, using tuples for convenience?

    Something like the following could be written:

    for (prev, next) in some_iter.windows(2) {
        ...
    }
    

    If not, could it be implemented as a trait on existing iterators?