An overview of the

When you use Python, you’ll be dealing with terms like list/tuple/dict, container, iterable, iterator, generator, etc. There are so many concepts mixed together that it can get confusing. Here’s a picture to show how they relate.

Today we are going to focus on iterators and generators. Before that we are going to look at containers and iterables.

So what is a container?

A container is a data structure that groups multiple elements together. Its elements can usually be retrieved one by one, and the in and not in operators test whether an element is present. Containers typically keep all of their elements in memory, with a few exceptions, such as iterators and generator objects, which do not hold all of their elements in memory. The familiar str, set, list, tuple, and dict objects are all containers.

# string
>>> 'h' in 'hello'
True
>>> 'z' in 'hello'
False
>>> 'z' not in 'hello'
True

# list
>>> 1 in [1, 2, 3]
True
>>> 5 in [1, 2, 3]
False
>>> 5 not in [1, 2, 3]
True

# set
>>> 1 in {1, 2, 3}
True
>>> 5 in {1, 2, 3}
False
>>> 5 not in {1, 2, 3}
True

# tuple
>>> 1 in (1, 2, 3)
True
>>> 5 in (1, 2, 3)
False
>>> 5 not in (1, 2, 3)
True

# dict
>>> 1 in {1: 'one', 2: 'two', 3: 'three'}
True
>>> 5 in {1: 'one', 2: 'two', 3: 'three'}
False
>>> 5 not in {1: 'one', 2: 'two', 3: 'three'}
True

Most containers provide some way to retrieve every element they hold, but that ability does not come from being a container; it comes from being an iterable. Not every container is iterable.
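To make the last point concrete, here is a minimal sketch of a container that is not iterable. The EvenNumbers class is a hypothetical name made up for this illustration; it supports in and not in through __contains__(), but because it defines neither __iter__() nor __getitem__(), calling iter() on it fails.

```python
class EvenNumbers:
    """Hypothetical container: supports `in`, but offers no way to iterate."""

    def __contains__(self, value):
        # Membership is computed on demand; nothing is stored.
        return isinstance(value, int) and value % 2 == 0


evens = EvenNumbers()
print(4 in evens)      # True
print(7 not in evens)  # True

try:
    iter(evens)        # no __iter__() and no __getitem__(), so this fails
except TypeError as exc:
    print("not iterable:", exc)
```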

So what is an iterable?

An iterable is any object that can return an iterator. Let’s look at an example:

>>> x = [1, 2, 3]
>>> a = iter(x)
>>> b = iter(x)
>>> next(a)
1
>>> next(a)
2
>>> next(b)
1
>>> type(x)
<class 'list'>
>>> type(a)
<class 'list_iterator'>
>>> type(b)
<class 'list_iterator'>

Here x is an iterable. Like “container”, “iterable” is an informal term rather than a specific data type: a list is an iterable, and so is a dict. The names a and b refer to two independent iterators. Each iterator holds internal state that records the current position, so that the next call returns the correct element. Iterators do have concrete types, such as list_iterator and set_iterator. An iterable implements the __iter__() method, which returns an iterator object.
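The same behaviour can be sketched with a user-defined iterable. The Playlist class below is a hypothetical example, not something from the standard library; its __iter__() simply delegates to the built-in list, so every call to iter() returns a fresh iterator with its own position.

```python
class Playlist:
    """Hypothetical iterable: every call to __iter__() returns a new iterator."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __iter__(self):
        # Delegate to the list; each call returns a fresh, independent iterator.
        return iter(self._songs)


playlist = Playlist(["intro", "verse", "outro"])
a = iter(playlist)
b = iter(playlist)

print(next(a))  # intro
print(next(a))  # verse
print(next(b))  # intro  (b keeps its own position)
print(a is b)   # False
```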

When running the code:

x = [1, 2, 3]
for ele in x:
    ...

What actually happens is this: the for statement first calls iter(x) to obtain an iterator, then calls next() on that iterator for each pass through the loop. For a container of n elements, __iter__() is called only once, while __next__() is called n times to produce the elements (one further call raises StopIteration and ends the loop).
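A rough hand-written equivalent of that for loop is sketched below. It is an approximation for illustration only, not how the interpreter literally implements the loop; the seen list is a hypothetical stand-in for the loop body.

```python
x = [1, 2, 3]
seen = []

it = iter(x)              # calls x.__iter__() exactly once
while True:
    try:
        ele = next(it)    # calls it.__next__() for each element
    except StopIteration:
        break             # the iterator is exhausted, so the loop ends
    seen.append(ele)      # stands in for the loop body

print(seen)  # [1, 2, 3]
```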

Iterator

An iterator is a stateful object that returns the next value from the container each time you call next() on it. Any object that implements both __iter__() and __next__() is an iterator (in Python 2.x the method is named next() rather than __next__()). __iter__() returns the iterator itself; __next__() returns the next value and raises a StopIteration exception when no elements remain.

Unlike a list, an iterator does not load all of its elements into memory at once. It produces them lazily, one at a time, and that is its main advantage. For example, a list of 10 million integers takes up more than 100 MB of memory, whereas an iterator that produces the same values needs only a few dozen bytes. Each element is computed only when next() is called (call by need); a for loop is essentially repeated calls to the iterator’s next() method.
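The size difference can be observed with sys.getsizeof. This is only a rough sketch: exact byte counts depend on the Python version and platform, so none are shown, and the value of n is an arbitrary choice for the demonstration.

```python
import sys

n = 1_000_000  # smaller than the 10 million in the text, to keep the demo quick

numbers = list(range(n))   # every element exists in memory right now
lazy = iter(range(n))      # elements are produced one at a time on request

# sys.getsizeof reports only the size of the object itself (for the list, its
# array of references), not the integer objects it points to.
list_bytes = sys.getsizeof(numbers)
iter_bytes = sys.getsizeof(lazy)

print(list_bytes > iter_bytes)  # True
print(next(lazy), next(lazy))   # 0 1
```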

The functions in the itertools module return iterators. To get a more intuitive feel for what happens inside an iterator, let’s define a custom one, using the Fibonacci sequence as an example:

# -*- coding: utf-8 -*-
class Fib(object):
    def __init__(self, max=0):
        super(Fib, self).__init__()
        self.prev = 0
        self.curr = 1
        self.max = max

    def __iter__(self):
        return self

    def __next__(self):
        if self.max > 0:
            self.max -= 1
            # the value of the current element, to be returned
            value = self.curr
            # the value of the next element to return
            self.curr += self.prev
            # set the previous element for the next step
            self.prev = value
            return value
        else:
            raise StopIteration

    # compatible with Python 2.x
    def next(self):
        return self.__next__()

if __name__ == '__main__':
    fib = Fib(10)
    # the for loop calls next() on fib repeatedly
    for n in fib:
        print(n)
    # fib is exhausted, so this raises StopIteration
    print(next(fib))

The Fibonacci sequence is 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …: starting from the third term, each term equals the sum of the two terms before it.

Generator

Now that we know about iterators, let’s look at generators. An ordinary function returns a value with return; a function that uses yield to produce values is called a generator function, and calling it returns a generator object. A generator is a special kind of iterator, but a more elegant one: instead of implementing __iter__() and __next__() by hand as an ordinary iterator does, you only need the yield keyword. Every generator is an iterator (the converse is not true), so every generator also produces its values lazily. Here’s an example that uses a generator to implement the Fibonacci sequence:

# -*- coding: utf-8 -*-
def fib(max):
    prev, curr = 0, 1
    while max > 0:
        max -= 1
        yield curr
        prev, curr = curr, prev + curr

if __name__ == '__main__':
    gen = fib(6)
    # the for loop calls next() on gen repeatedly
    for n in gen:
        print(n)
    # gen is exhausted, so this raises StopIteration
    print(next(gen))

A generator expression is the generator version of a list comprehension. It looks like a list comprehension, but it returns a generator object instead of a list.

>>> x = (x*x for x in range(10))
>>> type(x)
<class 'generator'>
>>> y = [x*x for x in range(10)]
>>> type(y)
<class 'list'>
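Because a generator expression produces a generator, it is lazy and can be consumed only once. The short sketch below illustrates both properties; the variable names are arbitrary choices for the example.

```python
squares = (n * n for n in range(5))   # nothing is computed yet

print(next(squares))   # 0
print(next(squares))   # 1
print(list(squares))   # [4, 9, 16]  (only the remaining values)
print(list(squares))   # []          (the generator is now exhausted)

# A generator expression can be passed directly to a function such as sum().
total = sum(n * n for n in range(5))
print(total)           # 30
```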

Let’s look at another example:

# -*- coding: utf-8 -*-
import time

def func(n):
    for i in range(0, n):
        # yield hands i to the caller and pauses here; the next call resumes from this line
        arg = yield i
        print('func', arg)

if __name__ == '__main__':
    f = func(6)
    while True:
        print('main-next:', next(f))
        print('main-send:', f.send(100))
        time.sleep(1)

The running results are as follows:

main-next: 0
func 100
main-send: 1
func None
main-next: 2
func 100
main-send: 3
func None
main-next: 4
func 100
main-send: 5
func None
Traceback (most recent call last):
  File "demo.py", line 13, in <module>
    print('main-next:', next(f))
StopIteration

yield works like return in that it hands a value back to the caller, but it also remembers where it stopped, and the next iteration resumes from that point. Both next() and send() resume the generator and return the next yielded element. The difference is that send() passes an argument into the generator, where it becomes the value of the yield expression, whereas the operand of yield is the value returned to the caller. When the generator is resumed with next(), the yield expression evaluates to None.
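A common use of send() is a generator that keeps running state between calls. The running_average function below is a hypothetical sketch written for this illustration. Note that the generator has to be advanced to its first yield with next() before a value can be sent in, because sending a non-None value to a generator that has just been created raises a TypeError.

```python
def running_average():
    """Hypothetical generator that averages every value sent into it."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Execution pauses here: `average` goes out to the caller, and the
        # argument of the next send() call comes back in as `value`.
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
first = next(avg)        # run to the first yield; nothing to report yet
print(first)             # None
print(avg.send(10))      # 10.0
print(avg.send(20))      # 15.0
print(avg.send(60))      # 30.0
```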

Conclusion

  1. An iterable is an object that implements the __iter__() method; calling iter() on it returns an iterator.
  2. An iterator is an object that implements both the __iter__() and __next__() methods.
  3. A for ... in ... loop actually converts the iterable into an iterator and then calls next() on it repeatedly.
  4. A generator is a special kind of iterator whose implementation is simpler and more elegant.
  5. yield is the key to how a generator implements __next__(). It serves as the pause-and-resume point of the generator’s execution: a value can be sent in as the result of the yield expression, and the yield expression’s operand is returned to the caller.
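The relationships in points 1, 2, and 4 can be checked with the abstract base classes in collections.abc. This is a small sketch; the countdown generator function is a hypothetical helper introduced only for the check.

```python
from collections.abc import Iterable, Iterator, Generator


def countdown(n):
    """Hypothetical generator function used only for this check."""
    while n > 0:
        yield n
        n -= 1


numbers = [1, 2, 3]
list_iter = iter(numbers)
gen = countdown(3)

for name, obj in [("list", numbers), ("list_iterator", list_iter), ("generator", gen)]:
    print(name,
          isinstance(obj, Iterable),
          isinstance(obj, Iterator),
          isinstance(obj, Generator))
# list True False False
# list_iterator True True False
# generator True True True
```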