https://github.com/rustomax/rust-iterators
1、迭代器是什么?
迭代器(iterator)负责遍历序列中的每一项和决定序列何时结束的逻辑,迭代器是 惰性的(lazy)。迭代器模式允许你对一个项的序列进行某些处理。
let v = vec![1, 2, 3];
let v_iter = v.iter(); //实际上只是创建了一个迭代器,没有做其他更深层次的动作
迭代器使用样例:计算1到10的和
fn main() {
println!("{:?}", (1..10).sum::<i32>());
}
2、Iterator trait 和 IntoIterator trait
迭代器都实现了定义于标准库的Iterator trait
(std::iter::Iterator
),该trait要求实现其的类型要impl
关联类型 Item
与方法next
,具体可参见定义
pub trait Iterator {
/// The type of the elements being iterated over.
#[stable(feature = "rust1", since = "1.0.0")]
type Item;
#[stable(feature = "rust1", since = "1.0.0")]
fn next(&mut self) -> Option<Self::Item>;
///一些其他默认实现
}
Iterator
提供了丰富的API
及其默认实现,并且其中的默认实现大部分都依赖于next()
,而next
是 Iterator
实现者被要求定义的唯一方法。next
一次返回迭代器中的一个项,封装在 Some
中,当迭代器结束时,它返回 None
。
标准库中的IntoIterator trait
定义如下
#[rustc_diagnostic_item = "IntoIterator"]
#[stable(feature = "rust1", since = "1.0.0")]
pub trait IntoIterator {
#[stable(feature = "rust1", since = "1.0.0")]
type Item;
#[stable(feature = "rust1", since = "1.0.0")]
type IntoIter: Iterator<Item = Self::Item>;
#[stable(feature = "rust1", since = "1.0.0")]
fn into_iter(self) -> Self::IntoIter;
}
#[stable(feature = "rust1", since = "1.0.0")]
impl<I: Iterator> IntoIterator for I {
type Item = I::Item;
type IntoIter = I;
fn into_iter(self) -> I {
self
}
}
意味着那些实现Iterator trait
的类型,将自动实现IntoIterator trait
,于是可以调用into_iter()
方法,而这也是for
循环某些类型的工作基础,如下例子
fn main() {
let v = vec![1, 2, 3];
for i in v {
println!("{:?}", i);
}
}
reference说明:A for
expression is a syntactic construct for looping over elements provided by an implementation of std::iter::IntoIterator
。
Iterator
有丰富的API及其默认实现,具体可以参考标准库文档
Iterators
The power of for
Rust's for
loop is safer, more powerful, and more efficient than C's. For example, let's iterate over an array. To write the program in C style:
let v = vec![1, 2, 3];
for i in 0..v.len() {
println!("{}", v[i]);
}
The above is an anti-pattern in Rust. In idiomatic Rust, it becomes:
let v = vec![1, 2, 3];
for x in v {
println!("{:?}", x);
}
The advantages of the second, idiomatic version are:
-
correctness: since the code inside the
for
loop does not access the indexi
directly, it cannot access an element not inv
, e.g.,v[3]
. -
efficiency: since the code inside the
for
loop does not access the indexi
directly, the compiler need not generate code for verifying thati
is in the bound of the vector. -
flexibility: the order in which
for
iterates over the elements inv
is transparent to the code inside the loop, so the inside code works with any types of collections, be it a vector, tree, hash table, etc.
Create iterators
Iterator traits
Rust allows a program to iterate over any type T
in a for
loop, as long as T
implements the Iterator
trait.
trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
}
trait IntoIterator where Self::IntoIter::Item == Self::Item {
type Item;
type IntoIter: Iterator;
fn into_iter(self) -> Self::IntoIter;
}
impl<I> IntoIterator for I where I: Iterator
The above looks complicated, so let's dissect them one by one.
-
The first trait,
Iterator
, defines the interface for an iterator. When a caller invokesnext()
, the iterator returnsSome(_)
if it has more items, orNone
if it has exhausted all its items. -
The second trait,
IntoIterator
, defines the interface for creating anIterator
. -
The last line is interesting. It instructs Rust to automatically implement
IntoIterator
for any type that implementsIterator
.
Desugar for
loop
Rust's for
loop syntax is sugar for iterators. A program may use a for
loop on any type that implements IntoIterator
. The statement for i in v { statement }
desugars approximately to
let mut iter = IntoIterator::into_iter(v);
loop {
match iter.next() {
Some(x) => { statement },
None => break,
}
}
Create iterators
If you wish to allow your type T
to be usable in a for
loop, you have two options:
impl Iterator for T
, orimpl IntoIterator for T
Which one should you choose?
In the first option, you must implement the method next()
for T
, which requires T
to contain some state that tracks which elements have been returned and which element to return next.
In the second option, T
creates an iterator and returns it. It is the responsibility of this iterator to maintain some state to track which element to return next.
The following two programs contrast these two options.
struct Counter {
max: i32,
// `count` tracks the state of this iterator.
count: i32,
}
impl Counter {
fn new(max: i32) -> Counter {
Counter { count: -1, max: max }
}
}
impl Iterator for Counter {
type Item = i32;
fn next(&mut self) -> Option<Self::Item> {
self.count += 1;
if self.count < self.max {
Some(self.count)
} else {
None
}
}
}
for i in Counter::new(10) {
println!("{}", i);
}
struct Counter {
max: i32,
// No need to track the state, because this isn't an iterator.
}
impl Counter {
fn new(max: i32) -> Counter {
Counter { max: max }
}
}
impl IntoIterator for Counter {
type Item = i32;
type IntoIter = std::ops::Range<Self::Item>;
fn into_iter(self) -> Self::IntoIter {
std::ops::Range{ start: 0, end: self.max }
}
}
for i in Counter::new(10) {
println!("{}", i);
}
Range operators
Rust's standard library defines several range operators in std::ops
. For example,
struct Range<Idx> {
pub start: Idx,
pub end: Idx,
}
The compiler treats a..b
as a shorthand for Range{ start: a, end: b }
. For example, the following two lines are equivalent:
for i in 0..n { ... }
for i in std::ops::Range{ start: 0, end: n } { ... }
There are two common uses of iterators.
- Adapters create a new iterator based on a current iterator.
- Consumers consume an iterator to produce another value.
Adapters
An iterator adapter is a function that takes an iterator and returns another iterator. Common adapters are map()
, take()
, and filter()
.
trait Iterator {
type Item;
fn filter<P>(self, predicate: P) -> Filter<Self, P>
where P: FnMut(&Self::Item) -> bool;
fn map<B, F>(self, f: F) -> Map<Self, F>
where F: FnMut(Self::Item) -> B;
fn take(self, n: usize) -> Take<Self>;
}
map()
map()
maps each element in the current iterator to an element in the new iterator. It takes a closure as the parameter. The closure takes one element in the current iterator and returns a value that becomes an element in the new iterator.
for i in (0..10).map(|x| x * 2) {
println!("{}", i);
}
Iterators are lazy, in the sense that the next()
method is called only when the program iterates over the iterator. For example, the following program prints nothing.
(0..10).map(|x| println!("{}", x));
take()
take()
creates an iterator that contains the first n
elements of the current iterator.
for i in (0..10).take(5) {
println!("{}", i);
}
filter()
filter()
creates a new iterator that contains some elements in the current iterator. It takes one parameter, which is a closure that determines whether each element in the current iterator should be included in the new iterator.
for i in (0..10).filter(|x| x % 2 == 0) {
println!("{}", i);
}
Chaining
Since an adapter returns an iterator, a program can chain adapters.
for i in (0..10).map(|x| x * 2).take(5).filter(|x| x % 3 == 0) {
println!("{}", i);
}
Consumers
A consumer also takes an iterator, but instead of returning another iterator, it returns a value that is not an iterator.
Common consumers are find()
, collect()
, and fold()
.
trait Iterator {
fn collect<B>(self) -> B where B: FromIterator<Self::Item>;
fn find<P>(&mut self, predicate: P) -> Option<Self::Item> where P: FnMut(&Self::Item) -> bool;
fn fold<B, F>(self, init: B, f: F) -> B where F: FnMut(B, Self::Item) -> B;
}
trait FromIterator<A> {
fn from_iter<T>(iter: T) -> Self where T: IntoIterator<Item=A>;
}
find()
find()
searches for the first element that satisfies a predicate. It returns Some(_)
if it finds it, or None
otherwise. find()
takes a parameter, which is a closure that takes a reference to each element in the iterator and returns true
or false
.
assert_eq!((0..10).find(|&x| x > 5).unwrap(), 6);
assert!((0..10).find(|&x| x > 10).is_none());
collect()
collect()
returns a collection that contains all the elements in the iterator. The receiving type (the type of the returned collection) must implement the FromIterator
trait.
struct Foo<T> {
v: Vec<T>,
}
impl<A> std::iter::FromIterator<A> for Foo<A> {
fn from_iter<T>(iter: T) -> Self where T: IntoIterator<Item=A> {
let mut foo = Foo{ v: vec![] };
for i in iter {
foo.v.push(i);
}
foo
}
}
let foo: Foo<_> = (0..10).collect();
// or
// let foo = (0..10).collect::<Foo<_>>();
println!("{:?}", foo.v);
fold()
fold()
applies a function to each element in the iterator to produce a single, final value. For example, the following program computes the sum of all the values in the array.
assert_eq!((0..10).fold(0, |sum, x| sum + x), 45);
fold()
takes two arguments. The first argument is the initial value of the return value. The second argument is a closure that takes the return value computed so far and the next element to update the return value.
You can implement MapReduce by combining map()
with fold()
. For example, the following computes the squared sum of an array.
let input = vec![1, 2, 3, 4, 5];
let output = input.iter().map(|x| x * x).fold(0, |x, y| x + y);
assert_eq!(output, 55);