逸之

2022/01/05

rust iter

Rust Iterators: A Guide

By the time you walk out of here, you should understand what iterators are good for, how they work internally, and how to create your own!

Iterators | Rust | January 6th, 2021

Iterators are a fairly central concept to Rust. If you're looping over something, you're very likely already using an iterator. If you're transforming collections, you probably should be using them. If your function returns a sequence of things, you should consider returning an iterator - especially if that sequence could be lazily evaluated.

WYWL: What you will learn

We'll take a look at how iterators are implemented, how to iterate over a collection, what sorts of iterators exist in the standard library, usage and common patterns for transforming data, and finally a few examples of useful crates that provide their functionality via powerful iterators.

Prerequisites

Skill-wise, you'll ideally have an understanding of

  • structs and enums,

  • the Option<T> generic type,

  • traits, and

  • just a pinch of closures.

I highly recommend having some sort of environment to run snippets of code in. The simplest thing to use is the Rust Playground. If you'd like a local environment, refer to The Book for guidance on setting that up.

What is an iterator?

Essentially, an iterator is a thing that allows you to traverse some sort of a sequence. Note that since Rust's iterators are lazy, this sequence could be generated on the fly - you could just as well traverse an existing array of finite length or create an iterator that keeps spewing out random numbers infinitely.

Laziness in programming is this general idea of delaying a computation until it's actually needed. A lazy iterator doesn't need to know all the elements it's going to return when it's first initialized - it can compute every next element when/if it's asked for.
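
To make the idea concrete, here is a small sketch that counts how often a mapping closure actually runs. The function name first_three_squares and the call counter are illustrative only. Although the range holds a thousand numbers, only the three that are asked for get computed.

```rust
use std::cell::Cell;

/// Squares only as many numbers as the caller consumes.
/// Returns the squares and the number of times the closure ran.
fn first_three_squares() -> (Vec<u64>, u32) {
    let calls = Cell::new(0);
    let squares: Vec<u64> = (1..=1000u64)
        .map(|n| {
            calls.set(calls.get() + 1); // record that work was done
            n * n
        })
        .take(3) // ask for three elements only
        .collect();
    (squares, calls.get())
}

fn main() {
    let (squares, calls) = first_three_squares();
    println!("{:?} computed with {} closure calls", squares, calls);
}
```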

The Iterator trait

In Rust, iterators are typically implemented using the Iterator trait. To implement that trait for a custom type, all we need to provide is an associated type Item (the type of the elements the iterator yields) and the next method.

Let's try and implement an iterator over numbers from 1 to 10.
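
A first skeleton could look like the sketch below. The struct name OneToTen and its new constructor are placeholders chosen for illustration; current is the piece of state the iterator will keep between calls. At this stage next does not yield anything yet.

```rust
/// An iterator that is meant to count from 1 to 10.
struct OneToTen {
    current: u32, // the last number we handed out
}

impl OneToTen {
    fn new() -> Self {
        OneToTen { current: 0 }
    }
}

impl Iterator for OneToTen {
    // The type of the elements this iterator yields.
    type Item = u32;

    // Called every time the consumer wants the next element.
    fn next(&mut self) -> Option<Self::Item> {
        None // placeholder: no elements yet
    }
}

fn main() {
    let mut numbers = OneToTen::new();
    println!("first call to next: {:?}", numbers.next());
    println!("state is still: {}", numbers.current);
}
```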

The next method is expected to yield the next element of the sequence. It takes a mutable reference to self in case we need to keep track of some state between next calls, which is normally the case. We'll soon find it useful.

The return type of next is Option<Self::Item>. If an iterator is finite, it needs to return None to indicate it has no more elements to return. Right now, this iterator will immediately finish, not yielding any items. Let's fix this.
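
One way to write such a next method is sketched here, again with the hypothetical OneToTen struct. It advances current and yields it, and returns None once 10 has been handed out.

```rust
struct OneToTen {
    current: u32,
}

impl Iterator for OneToTen {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.current < 10 {
            self.current += 1; // move on to the next number
            Some(self.current)
        } else {
            None // 10 was already yielded, so the sequence is over
        }
    }
}

fn main() {
    let mut numbers = OneToTen { current: 0 };
    println!("{:?}", numbers.next());
    println!("{:?}", numbers.next());
}
```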

This should be pretty self-explanatory. Every call to next increments self.current and yields it until it grows beyond 10.

Looping

So we have an iterator. Let's use it. The most basic thing we can do is simply loop over all of its elements:
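
A sketch of such a loop, assuming the hypothetical OneToTen iterator from before (repeated here so the snippet compiles on its own). The helper sum_with_for_loop is only there to show that the loop body sees every element.

```rust
struct OneToTen {
    current: u32,
}

impl Iterator for OneToTen {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.current < 10 {
            self.current += 1;
            Some(self.current)
        } else {
            None
        }
    }
}

/// Adds up every element the iterator yields, using a plain `for` loop.
fn sum_with_for_loop(numbers: OneToTen) -> u32 {
    let mut total = 0;
    for n in numbers {
        total += n;
    }
    total
}

fn main() {
    let numbers = OneToTen { current: 0 };
    for n in numbers {
        println!("{}", n);
    }
    println!("sum = {}", sum_with_for_loop(OneToTen { current: 0 }));
}
```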

There's actually an Iterator implementation for the [Range](https://doc.rust-lang.org/std/ops/struct.Range.html) type in Rust! A Range is what you get when you type something like 1..5. Instead of writing the above custom iterator, we could have simply done this:
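
For instance, the numbers 1 to 10 can be produced with the range 1..11, whose end is exclusive. The helper range_values below is a made-up name used only to make the result easy to inspect.

```rust
/// Collects the values visited by looping over a range.
fn range_values() -> Vec<u32> {
    let mut seen = Vec::new();
    // 1..11 is a Range<u32>: the start is included, the end is not.
    for n in 1..11 {
        seen.push(n);
    }
    seen
}

fn main() {
    for n in 1..11 {
        println!("{}", n);
    }
    println!("{:?}", range_values());
}
```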

IntoIterator, iter() and iter_mut()

Sometimes you'll want to provide the user of your code the ability to iterate over your type, but without that type itself being an iterator. This will be the case with collections. If you implement your own vector type, you probably don't want that type to needlessly hold an extra iteration variable just in case someone wants to iterate over it.

What we want is a way to create an iterator out of the collection. There are three mechanisms you'll typically see.

  • The IntoIterator trait is implemented for the type and provides you with the into_iter method. This one consumes the data and wraps it in an owning iterator.

  • The iter method is defined directly on the type. This method will borrow the data immutably and return an iterator that provides immutable references.

  • The iter_mut method is defined directly on the type. This method will borrow the data mutably and return an iterator that provides mutable references.

If we want to iterate over a vector of chars, we could do something like this:
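
A minimal sketch with an explicit into_iter() call. The helper name join_chars is an assumption made for this example; it takes ownership of the vector and visits each char by value.

```rust
/// Joins the characters into a string, consuming the vector.
fn join_chars(v: Vec<char>) -> String {
    let mut out = String::new();
    for c in v.into_iter() {
        // `c` is an owned `char` here
        out.push(c);
    }
    out
}

fn main() {
    let v = vec!['a', 'b', 'c'];
    println!("{}", join_chars(v));
}
```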

If we leave out the into_iter() call, Rust will call that method implicitly anyway.
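
The same sketch without the explicit call behaves identically (join_chars is again a hypothetical helper):

```rust
/// Joins the characters into a string, consuming the vector.
fn join_chars(v: Vec<char>) -> String {
    let mut out = String::new();
    // No explicit call: the `for` loop calls `into_iter()` on `v` for us.
    for c in v {
        out.push(c);
    }
    out
}

fn main() {
    let v = vec!['a', 'b', 'c'];
    println!("{}", join_chars(v));
}
```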

This implicit call is important to keep in mind. Since into_iter() consumes the data, we cannot use the original vector later.
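
The sketch below illustrates the problem. The offending line is left as a comment so that the snippet still compiles; count_chars is a name made up for this example.

```rust
/// Counts the elements by looping over the vector by value.
fn count_chars(v: Vec<char>) -> usize {
    let mut count = 0;
    for _c in v {
        // `v` was moved into the loop above
        count += 1;
    }
    // Using `v` again at this point would not compile, for example:
    // println!("{:?}", v); // error: borrow of moved value: `v`
    count
}

fn main() {
    println!("{}", count_chars(vec!['a', 'b', 'c']));
}
```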

One solution would be to explicitly call v.iter() so that the iterator is borrowing instead. Another is to provide a reference to v rather than the owned value.
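
Both options are sketched here side by side. The function count_and_keep is hypothetical; it loops over the vector twice and can still hand the vector back afterwards, because neither loop consumed it.

```rust
/// Loops over `v` twice by reference, then returns it untouched.
fn count_and_keep(v: Vec<char>) -> (usize, Vec<char>) {
    let mut count = 0;
    for _c in v.iter() {
        // explicit borrowing iterator
        count += 1;
    }
    for _c in &v {
        // looping over a reference borrows as well
        count += 1;
    }
    (count, v) // `v` is still ours to use
}

fn main() {
    let (count, v) = count_and_keep(vec!['a', 'b', 'c']);
    println!("visited {} elements, vector is still {:?}", count, v);
}
```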

This way, the loop no longer consumes the vector. A for loop always calls into_iter() on whatever it is given, but here that is a &Vec<char> rather than an owned Vec<char>. IntoIterator for a reference to a vector only borrows the data, which is equivalent to calling v.iter() - and that's what we want.

Then there's mutability. Following the same pattern, looping over &mut v gives us an iterator that mutably borrows the data, just as an explicit call to v.iter_mut() would.
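
A short sketch of mutable iteration; uppercase_all is an illustrative helper, not a standard function. Each c is a &mut char, so it has to be dereferenced to assign a new value.

```rust
/// Upper-cases every character in place and returns the vector.
fn uppercase_all(mut v: Vec<char>) -> Vec<char> {
    for c in &mut v {
        // `c` is a `&mut char`; write through it to change the element
        *c = c.to_ascii_uppercase();
    }
    v
}

fn main() {
    println!("{:?}", uppercase_all(vec!['a', 'b', 'c']));
}
```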

Adapters

If all you could do with iterators was loop over them with the for keyword, they wouldn't be all that useful. But there is a plethora of adapters that transform an iterator into another kind of iterator, altering its behavior.

Most adapters you'll work with are in the standard library and exist as methods provided for the implementers of the [Iterator](https://doc.rust-lang.org/std/iter/trait.Iterator.html#provided-methods) trait. There are some crates out there (such as itertools) that provide extra adapters via extensions.

We're going to go through a few useful adapters, but I highly recommend taking a look at the full list in Rust documentation.

We can construct an infinite iterator using [std::iter::repeat](https://doc.rust-lang.org/std/iter/fn.repeat.html). It would be a bad idea to iterate over it directly.

If we want to print only a few 1s, however, we can use the take adapter for that.
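
A minimal sketch that limits the endless repeat(1) to five elements. The count of five and the helper name five_ones are arbitrary choices for this example.

```rust
use std::iter;

/// Takes the first five elements of an endless stream of ones.
fn five_ones() -> Vec<i32> {
    iter::repeat(1).take(5).collect()
}

fn main() {
    // Without `take`, this loop would never end.
    for n in iter::repeat(1).take(5) {
        println!("{}", n);
    }
    println!("{:?}", five_ones());
}
```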

What take does under the hood is wrap the Repeat iterator in a Take wrapper. Take is also an iterator, but this one finishes after returning a set number of elements.

A very typical feature of functional programming is the ability to map, that is to apply a function to every element of a sequence.
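
As a sketch, here the map adapter doubles every number in a slice. The function name doubled is made up for illustration.

```rust
/// Applies a function (doubling) to every element of the slice.
fn doubled(numbers: &[i32]) -> Vec<i32> {
    numbers
        .iter()
        .map(|n| n * 2) // runs once per element, as elements are requested
        .collect()
}

fn main() {
    let numbers = vec![1, 2, 3];
    println!("{:?}", doubled(&numbers));
}
```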

If we want to find only elements that fulfill a certain criterion, we can use the filter adapter.
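
A small sketch that keeps only the even numbers; evens is a hypothetical helper. Note that the closure passed to filter receives a reference to each element rather than the element itself.

```rust
/// Keeps only the even numbers.
fn evens(numbers: Vec<i32>) -> Vec<i32> {
    numbers
        .into_iter()
        .filter(|n| n % 2 == 0) // `n` is a `&i32` here
        .collect()
}

fn main() {
    println!("{:?}", evens(vec![1, 2, 3, 4, 5, 6]));
}
```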

Collecting

We know how to turn a collection into an iterator. How do we turn an iterator into a collection? Enter the collect method, provided by the Iterator trait.

We could try to convert from a vector to an iterator and back.

This, however, produces an error: the compiler reports that type annotations are needed.

What Rust is telling us here is that it doesn't know what we're trying to collect into. collect has a generic return type and could give you a number of things: a vector, a linked list, a string, etc. You could even create a custom type that can be collected into.

To let Rust know what concrete type we want collect to return, we can use the turbofish syntax.

We can make this slightly shorter. The compiler should be able to figure out that we want a vector of chars, specifically, and not a vector of integers. When filling out type parameters, we can use an underscore to tell Rust, "Figure this part out yourself!"
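
Both spellings of the turbofish are sketched below, along with the untyped call (commented out) that the compiler rejects. The two function names are placeholders for this example.

```rust
/// Turbofish with the full target type spelled out.
fn round_trip_full(v: Vec<char>) -> Vec<char> {
    v.into_iter().collect::<Vec<char>>()
}

/// Turbofish with the element type left for the compiler to infer.
fn round_trip_inferred(v: Vec<char>) -> Vec<char> {
    v.into_iter().collect::<Vec<_>>()
}

fn main() {
    let v = vec!['a', 'b', 'c'];
    // let v2 = v.into_iter().collect(); // rejected: type annotations needed
    let v2 = v.into_iter().collect::<Vec<_>>();
    println!("{:?}", v2);
    println!("{:?}", round_trip_full(vec!['x']));
    println!("{:?}", round_trip_inferred(vec!['y']));
}
```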

In a case like this, you might find it tidier to add a type annotation to the variable declaration instead. The whole thing will then look like this:
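
A sketch of that variant, with the target type written on the variable instead of on the collect call (round_trip is a hypothetical helper):

```rust
/// Converts a vector to an iterator and back again.
fn round_trip(v: Vec<char>) -> Vec<char> {
    // The annotation on `v2` tells `collect` what to build.
    let v2: Vec<_> = v.into_iter().collect();
    v2
}

fn main() {
    let v = vec!['a', 'b', 'c'];
    let v2: Vec<_> = v.into_iter().collect();
    println!("{:?}", v2);
    println!("{:?}", round_trip(vec!['d', 'e']));
}
```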

Using collect, we can convert an array of chars into a string.
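
For example, assuming a small array of letters and an illustrative helper named chars_to_string:

```rust
/// Collects characters into a `String`.
fn chars_to_string(chars: &[char]) -> String {
    // `String` can be built from an iterator over `&char`.
    chars.iter().collect()
}

fn main() {
    let chars = ['h', 'e', 'l', 'l', 'o'];
    println!("{}", chars_to_string(&chars));
}
```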

We could also collect an iterator of tuples (where the first element needs to be hashable) into a HashMap:
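
A sketch with invented sample data: each tuple becomes one key-value entry. The function price_list and its contents are hypothetical.

```rust
use std::collections::HashMap;

/// Builds a map from (key, value) tuples.
fn price_list() -> HashMap<&'static str, u32> {
    vec![("apple", 3), ("pear", 5)]
        .into_iter()
        .collect() // the return type tells `collect` to build a HashMap
}

fn main() {
    let prices = price_list();
    println!("an apple costs {:?}", prices.get("apple"));
}
```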

Things that can be collected into implement the FromIterator trait. That means this behavior is extendable! Check out the trait's docs to see which types can be collected into and how to implement new ones.

Common patterns

We now have some idea of how iterators work and some operations we can perform on them. Let's put it together and see some typical use cases.

Transform and collect

Let's say we store customer data in Customer structs, which include a customer's name, e-mail address, and how much they owe us.

Then let's say we have a list of such customers. We're tasked with producing a vector of all debtor e-mails so we can send them a generic reminder.

How do we do it? The nice, idiomatic way is to get an iterator over customers, apply some adapters that will filter and transform the data, and then collect that back into a vector.
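
The whole pipeline could be sketched as follows. The field names name, email and debt, the customer helper, and the sample data are assumptions made for this example; only the overall shape (filter, then map, then collect) comes from the description above.

```rust
struct Customer {
    name: String,
    email: String,
    debt: u32, // hypothetical: amount owed, in whole currency units
}

/// Small helper that keeps the sample data readable.
fn customer(name: &str, email: &str, debt: u32) -> Customer {
    Customer {
        name: name.to_string(),
        email: email.to_string(),
        debt,
    }
}

/// Returns the e-mail addresses of all customers who owe money.
fn debtor_emails(customers: &[Customer]) -> Vec<String> {
    customers
        .iter() // borrow each customer
        .filter(|c| c.debt > 0) // keep only the debtors
        .map(|c| c.email.clone()) // pick out the e-mail address
        .collect()
}

fn main() {
    let customers = vec![
        customer("Alice", "alice@example.com", 0),
        customer("Bob", "bob@example.com", 120),
        customer("Carol", "carol@example.com", 45),
    ];
    for c in &customers {
        println!("{} owes {}", c.name, c.debt);
    }
    println!("{:?}", debtor_emails(&customers));
}
```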

Finding things in collections

Given the same Customer struct as above, and the same vector of customers, we can search the customer data for a specific person. Vec<T> itself has no find method, but the Iterator trait provides one.
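
A sketch of such a lookup, assuming a Customer struct with hypothetical name, email and debt fields. find returns an Option, since nobody might match.

```rust
struct Customer {
    name: String,
    email: String,
    debt: u32,
}

/// Returns the first customer with the given name, if there is one.
fn find_by_name<'a>(customers: &'a [Customer], name: &str) -> Option<&'a Customer> {
    customers.iter().find(|c| c.name == name)
}

fn main() {
    let customers = vec![
        Customer { name: "Alice".to_string(), email: "alice@example.com".to_string(), debt: 0 },
        Customer { name: "Bob".to_string(), email: "bob@example.com".to_string(), debt: 120 },
    ];
    match find_by_name(&customers, "Bob") {
        Some(c) => println!("{} <{}> owes {}", c.name, c.email, c.debt),
        None => println!("no such customer"),
    }
}
```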

What if we'd like to get the position of an element in the vector? Things get a little trickier. We start by creating an iterator.

We then have to enumerate it to keep track of positions. enumerate will wrap every element in a tuple of the form (position, element).

Then we have to change our find closure a little to account for the items now being tuples.

Finally, once we unwrap the Option<(usize, Customer)>, we still have to extract the position component of the tuple, which is the 0th one.
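
These steps can be sketched as below, again assuming a Customer struct with a hypothetical name field (trimmed to that one field for brevity). Because this sketch iterates by reference, the tuple holds a &Customer. The helper position_of keeps the Option instead of unwrapping, while main shows the unwrap-and-take-.0 form.

```rust
struct Customer {
    name: String,
}

/// Returns the index of the first customer with the given name.
fn position_of(customers: &[Customer], name: &str) -> Option<usize> {
    customers
        .iter()
        .enumerate() // items are now (position, &Customer) tuples
        .find(|(_, c)| c.name == name) // the closure takes the tuple apart
        .map(|pair| pair.0) // keep only the position
}

fn main() {
    let customers = vec![
        Customer { name: "Alice".to_string() },
        Customer { name: "Bob".to_string() },
    ];
    let position = customers
        .iter()
        .enumerate()
        .find(|(_, c)| c.name == "Bob")
        .unwrap() // panics if nobody matches
        .0;
    println!("Bob is at index {}", position);
    println!("{:?}", position_of(&customers, "Carol"));
}
```

The standard library also offers Iterator::position, which performs the same search without the manual enumerate step.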

Splitting strings

The str type comes with a split method that yields an iterator over chunks of that string. All you have to provide is a pattern to split by - commonly a char or a &str (a reference to a String works too).

For example, you could get all the words of a phrase this way:
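
A minimal sketch, splitting on a single space character. The helper name words and the sample phrase are made up.

```rust
/// Splits a phrase on spaces and returns the pieces.
fn words(phrase: &str) -> Vec<&str> {
    phrase.split(' ').collect()
}

fn main() {
    for word in "the quick brown fox".split(' ') {
        println!("{}", word);
    }
    println!("{:?}", words("the quick brown fox"));
}
```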

And then you could transform them and collect them into a new String:
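
One possible transformation, chosen here purely for illustration, is to spell every word backwards and join the results with spaces. The function mirror_words is hypothetical.

```rust
/// Reverses the letters of each word, keeping the word order.
fn mirror_words(phrase: &str) -> String {
    phrase
        .split(' ')
        .map(|word| word.chars().rev().collect::<String>()) // reverse one word
        .collect::<Vec<String>>()
        .join(" ")
}

fn main() {
    println!("{}", mirror_words("hello world"));
}
```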

The rev method is an adapter that reverses an iterator!

Examples of third-party iterators

These are some examples of third-party libraries providing some functionality via iterators.

  • walkdir - walkdir allows us to traverse a directory easily - it yields an iterator over a chosen directory's entries (files and subdirectories)

  • logos - lexers created with logos return iterators over tokens

  • csv - when reading a CSV file, we get an iterator over its records

Conclusion

This just about exhausts the core concepts and basic usage. Hopefully, you should now be able to not only effectively transform collections, but (with some practice) also identify where providing iterators would make sense in your code.
