Iterators and generators are very important concepts in Python, and I’m going to talk about my own understanding of what I’ve learned recently.

The iterator

There are many containers in Python, such as lists, tuples, dictionaries, and sets. A container can be thought of as a unit that holds multiple elements together. All of these containers are iterable.

We usually use a for ... in statement to traverse an iterable object. The underlying mechanism is as follows:

An iterable returns an iterator when it is passed to the iter() function. The iterator provides a __next__() method, which the built-in next() function calls. Each call either returns the next object in the container or, once the elements are exhausted, raises a StopIteration exception.

Here’s an example:

>>> items = [1, 2, 3]
>>> # Get the iterator
>>> it = iter(items) # Invokes items.__iter__()
>>> # Run the iterator
>>> next(it) # Invokes it.__next__()
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

To manually traverse the iterable, use the next() function and catch the StopIteration exception in your code.
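
As a minimal sketch of that manual traversal, the function below does by hand roughly what a for loop does for us. The name manual_for is made up for illustration and is not part of Python.

```python
def manual_for(iterable):
    """Collect items using iter() and next() directly, like a for loop would."""
    collected = []
    it = iter(iterable)            # the same first step a for statement takes
    while True:
        try:
            item = next(it)        # ask the iterator for the next element
        except StopIteration:      # raised once the iterator is exhausted
            break
        collected.append(item)     # stands in for the body of the loop
    return collected


print(manual_for([1, 2, 3]))  # [1, 2, 3]
```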

In most cases, we will use a for loop to traverse an iterable. Occasionally, however, more precise control of iterations is needed, and it is important to understand the underlying iteration mechanism.
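
One case where this finer control can help is consuming the first element by hand and letting a loop handle the rest. The sketch below assumes some made-up comma-separated lines whose first entry is a header.

```python
# Hypothetical input: the first line is a header, the rest are records.
lines = iter(["name,age", "ann,31", "bob,27"])

header = next(lines).split(",")              # consume the header manually
rows = [line.split(",") for line in lines]   # iteration resumes after the header

print(header)  # ['name', 'age']
print(rows)    # [['ann', '31'], ['bob', '27']]
```

Because lines is an iterator and not a list, the comprehension picks up where next() left off instead of starting over.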

The generator

A generator is simply an iterator that produces its values lazily, one at a time, as they are requested.

Its advantage over a fully built list is memory use. For example, the list comprehension [i for i in range(100000000)] creates a list of 100 million elements, all of which stay in memory once they are generated. But we often don’t need to keep that many values around. We just want the next value to be produced when next() is called, and that is what a generator does. In Python, the generator version is written as (i for i in range(100000000)).
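
A rough way to see the difference is to compare the sizes of the two objects with sys.getsizeof. This sketch uses one million elements so it runs quickly. Note that getsizeof reports only the list object itself, not the integers it refers to, so the real gap is even larger.

```python
import sys

n = 1_000_000  # smaller than the 100 million in the text, to keep the demo fast

as_list = [i for i in range(n)]   # every element is created and stored right away
as_gen = (i for i in range(n))    # nothing is computed until next() is called

list_size = sys.getsizeof(as_list)  # grows with n
gen_size = sys.getsizeof(as_gen)    # small and independent of n

print(list_size > gen_size)         # True
print(next(as_gen), next(as_gen))   # 0 1
```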

Generators can also take other forms, such as a generator function, which uses the yield keyword to hand a result to each next() call and then pauses until the next one is requested. Here’s an example:

def frange(start, stop, increment):
    x = start
    while x < stop:
        yield x
        x += increment

for n in frange(0, 2, 0.5):
    print(n)

0
0.5
1.0
1.5

Compared with building a complete list up front, generators have the following advantages:

  1. Lower memory use
  2. Deferred (lazy) computation
  3. More readable code
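
Deferred computation is easiest to see with a generator that never ends. In the sketch below, squares is a made-up generator function. itertools.islice takes only the first five values, so nothing beyond them is ever computed.

```python
from itertools import islice


def squares():
    """Yield 0, 1, 4, 9, ... forever; each value is computed only on request."""
    n = 0
    while True:
        yield n * n
        n += 1


first_five = list(islice(squares(), 5))  # stops pulling values after five items
print(first_five)  # [0, 1, 4, 9, 16]
```

The same sequence could not be built as a list first, since it has no end.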

Some applications

There is a classic algorithm problem: given two sequences, determine whether the first is a subsequence of the second. Link to LeetCode: leetcode-cn.com/problems/is…

The usual approach is to keep two pointers, one at the start of each list. The second pointer sweeps through the second list, and whenever its element equals the element under the first pointer, the first pointer moves forward. If the first pointer reaches the end of the first list, we return True.
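
For reference, one way to write the two-pointer approach is sketched below. The function name is_subsequence_two_pointers is chosen here for illustration, and the code assumes both arguments support indexing and len().

```python
def is_subsequence_two_pointers(a, b):
    """Return True if a is a subsequence of b (both must be indexable)."""
    i = 0  # position in a: the next element we still need to find
    j = 0  # position in b: the element currently being examined
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1   # matched, so move on to the next element of a
        j += 1       # always advance through b
    return i == len(a)   # True only if every element of a was matched


print(is_subsequence_two_pointers([1, 3, 5], [1, 2, 3, 4, 5]))  # True
print(is_subsequence_two_pointers([1, 4, 3], [1, 2, 3, 4, 5]))  # False
```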

Written out in full, this takes around a dozen lines of code, but once we understand iterators and generators, the solution becomes extremely simple.

def is_subsequence(a, b): 
    b = iter(b)
    return all(i in b for i in a)

print(is_subsequence([1, 3, 5], [1, 2, 3, 4, 5]))
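
The trick works because a membership test on an iterator consumes it: the in operator pulls elements one by one until it finds a match, and whatever it has passed is gone. A small sketch of that behavior, using the same list as above:

```python
b = iter([1, 2, 3, 4, 5])
found = 3 in b           # scans 1, 2, 3 and stops at the match
remaining = list(b)      # only what the search did not consume
print(found, remaining)  # True [4, 5]

# Order matters: [3, 1] is not a subsequence, because 1 comes before 3.
b = iter([1, 2, 3, 4, 5])
in_order = [x in b for x in [3, 1]]
print(in_order)          # [True, False]
```

Each later search starts where the previous one stopped, which is exactly the job the second pointer does in the two-pointer version.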

The second example is about code readability.

Suppose we need to find the starting position of each word in a paragraph of text. Without generators:

def index_words(text):
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text, 1):
        if letter == ' ':
            result.append(index)
    return result

Using generators:

def index_words(text):
    if text:
        yield 0
    for index, letter in enumerate(text, 1):
        if letter == ' ':
            yield index

Without generators, every result is wrapped in result.append(index), so we read the list operation first and the index second. With a generator, we simply yield index. Without the distraction of the list append, we can see at a glance that the code is returning index.
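
Callers who still want a list can wrap the generator in list(). The sketch below repeats the generator version so that it runs on its own. The sample sentence is made up for illustration.

```python
def index_words(text):
    """Yield the starting index of each word in text (words split on spaces)."""
    if text:
        yield 0                      # the first word starts at position 0
    for index, letter in enumerate(text, 1):
        if letter == ' ':
            yield index              # a word starts right after each space


text = "four score and seven"
positions = list(index_words(text))  # materialize the results only when needed
print(positions)  # [0, 5, 11, 15]
```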

As a final note, one important caveat when using a generator is that it can only be traversed once.
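
A short sketch of what that looks like in practice, along with one possible workaround: wrap the generator expression in a function (here the made-up make_squares) so that every call starts fresh.

```python
gen = (n * n for n in range(3))
first_pass = list(gen)    # consumes every value
second_pass = list(gen)   # the generator is already exhausted
print(first_pass, second_pass)  # [0, 1, 4] []


def make_squares():
    """Return a brand-new generator each time it is called."""
    return (n * n for n in range(3))


print(list(make_squares()), list(make_squares()))  # [0, 1, 4] [0, 1, 4]
```

If the values are needed more than once and fit in memory, storing them in a list is the simpler choice.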