In this article, we introduce the enhanced generator (coroutine) in Python: its basic syntax, usage scenarios, precautions, and how it compares with coroutine implementations in other languages.

Enhanced generator

The yield and generator scenarios described above use only the generator's next method, but generators are more powerful than that. PEP 342 added several methods to generators that make them behave much more like coroutines. The main change: where yield could previously only return a value (with the generator acting as a data producer), the new send method lets the generator receive a value when it resumes, and the caller can also throw an exception into the generator while it is suspended.

First look at the enhanced yield, which has the following syntax:

back_data = yield cur_ret

When this statement executes, cur_ret is returned to the caller; when the generator is resumed by next() or send(some_data), some_data is assigned to back_data.

def gen(data):
    print 'before yield', data
    back_data = yield data
    print 'after resume', back_data

if __name__ == '__main__':
    g = gen(1)
    print g.next()
    try:
        g.send(0)
    except StopIteration:
        pass

Output:

before yield 1
1
after resume 0

Understanding of this place:

g = gen(1) creates a generator object. When g.next() is executed, print 'before yield', data runs; execution then stops at back_data = yield data, and the caller prints the yielded value 1. When g.send(0) is executed, the generator resumes at the yield expression, the sent value is assigned (back_data = 0), and print 'after resume', back_data prints 0.

Two points to note:

(1) next() is equivalent to send(None). (2) The first call must be next() or send(None); you cannot send a value other than None, because the generator has not yet reached a yield expression that could receive it.
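Both points can be demonstrated in a few lines. The sketch below uses Python 3 syntax (send/TypeError behavior is the same in 2.7, where g.next() would be used instead of send(None)); the generator itself is illustrative:

```python
def gen():
    # A generator that receives a value and yields back its double.
    x = yield
    yield x * 2

# Point (2): sending a non-None value before the generator has started fails.
g = gen()
try:
    g.send(1)
except TypeError as e:
    start_error = str(e)   # "can't send non-None value to a just-started generator"

# Point (1): send(None) is equivalent to next() and advances to the first yield.
g2 = gen()
g2.send(None)              # same effect as next(g2)
doubled = g2.send(21)      # resumes at `x = yield`, x = 21, yields 42
```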

Application scenarios

With send, a generator can accept data pushed into it (while resuming from a suspended state) rather than only produce data. The following example comes from here:

word_map = {}
def consume_data_from_file(file_name, consumer):
    for line in file(file_name):
        consumer.send(line)

def consume_words(consumer):
    while True:
        line = yield
        for word in (w for w in line.split() if w.strip()):
            consumer.send(word)

def count_words_consumer():
    while True:
        word = yield
        if word not in word_map:
            word_map[word] = 0
        word_map[word] += 1
    print word_map  # never reached: the while loop above does not exit

if __name__ == '__main__':
    cons = count_words_consumer()
    cons.next()
    cons_inner = consume_words(cons)
    cons_inner.next()
    c = consume_data_from_file('test.txt', cons_inner)
    print word_map

Understanding of this place:

cons_inner = consume_words(cons) passes one generator into another. In this program, consumer.send(word) pushes each word to the word-counting generator for consumption, and consume_data_from_file is designed the same way. In the end, consume_data_from_file produces raw lines, consume_words produces words from those lines, and count_words_consumer consumes the words: a complete producer-consumer system.
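The same pipeline can be run without a file on disk by feeding it a list of lines. A self-contained sketch in Python 3 syntax (function names and the input data are illustrative):

```python
from collections import defaultdict

word_map = defaultdict(int)

def consume_data(lines, consumer):
    # Producer: push raw lines into the pipeline.
    for line in lines:
        consumer.send(line)

def consume_words(consumer):
    # Middle stage: receive a line, push each word downstream.
    while True:
        line = yield
        for word in line.split():
            consumer.send(word)

def count_words():
    # Final consumer: count each received word.
    while True:
        word = yield
        word_map[word] += 1

counter = count_words()
next(counter)                      # prime: advance to the first yield
splitter = consume_words(counter)
next(splitter)
consume_data(['a b', 'b c'], splitter)
```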

In the code above, the real data consumer is count_words_consumer, the original data producer is consume_data_from_file, and data flows actively from producer to consumer. However, the two explicit priming calls (cons.next() and cons_inner.next()) are boilerplate that can be encapsulated with a decorator.

def consumer(func):
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        gen.next()
        return gen
    wrapper.__name__ = func.__name__
    wrapper.__dict__ = func.__dict__
    wrapper.__doc__  = func.__doc__
    return wrapper

The modified code

def consumer(func):
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        gen.next()
        return gen
    wrapper.__name__ = func.__name__
    wrapper.__dict__ = func.__dict__
    wrapper.__doc__  = func.__doc__
    return wrapper

word_map = {}
def consume_data_from_file(file_name, consumer):
    for line in file(file_name):
        consumer.send(line)

@consumer
def consume_words(consumer):
    while True:
        line = yield
        for word in (w for w in line.split() if w.strip()):
            consumer.send(word)

@consumer
def count_words_consumer():
    while True:
        word = yield
        if word not in word_map:
            word_map[word] = 0
        word_map[word] += 1
    print word_map  # never reached: the while loop above does not exit

if __name__ == '__main__':
    cons = count_words_consumer()
    cons_inner = consume_words(cons)
    c = consume_data_from_file('test.txt', cons_inner)
    print word_map

The consumer decorator simply creates the generator object, advances it to its first yield, and returns it.
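In Python 3 the same decorator is usually written with functools.wraps, which copies the function metadata in one step, and the next() builtin. A sketch, with an illustrative doubler generator to show usage:

```python
import functools

def consumer(func):
    # Same idea as the decorator above, in Python 3 form:
    # create the generator, advance it to the first yield, return it.
    @functools.wraps(func)     # copies __name__, __doc__, __dict__ in one step
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)              # prime the generator so send() works immediately
        return gen
    return wrapper

@consumer
def doubler(results):
    while True:
        value = yield
        results.append(value * 2)

out = []
d = doubler(out)   # already primed: no explicit next() call needed
d.send(3)
d.send(4)
```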

Generator throw

In addition to next and send, generators provide two other useful methods, throw and close, which strengthen the caller's control over the generator. send passes a value into the generator, throw raises an exception at the point where the generator is suspended, and close terminates the generator normally (after which next/send can no longer be called). Let's look at the throw method in more detail.

  • throw(type[, value[, traceback]])

Raises an exception of type type at the point where the generator is suspended at yield, and returns the next value the generator yields. If the generator does not catch the exception, it propagates to the caller. In addition, if the generator cannot yield a new value, a StopIteration exception is raised to the caller.

@consumer
def gen_throw():
    value = yield
    try:
        yield value
    except Exception, e:
        yield str(e)  # if this line is commented out, StopIteration is raised

if __name__ == '__main__':
    g = gen_throw()
    assert g.send(5) == 5
    assert g.throw(Exception, 'throw Exception') == 'throw Exception'

The first send resumes the generator, which yields value (5) back to the caller and suspends at yield value; the subsequent throw is then caught by the try/except inside the generator. If the except block did not yield again, a StopIteration exception would be raised instead.

Understanding of this place:

g = gen_throw() goes through the consumer decorator, so it actually executes g = gen_throw() plus a priming next(), which runs the generator up to value = yield. Then g.send(5) assigns value = 5 and the generator yields 5 back to the main program, so the first assert holds. When we call g.throw(Exception, 'throw Exception'), the exception is raised at the last yield, caught by the try/except, and yield str(e) returns 'throw Exception' to the main program, so the second assert also holds. Note the point above about StopIteration.
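The close method mentioned alongside throw can be exercised the same way. A small sketch in Python 3 syntax (names are illustrative): close() raises GeneratorExit at the suspended yield, giving the generator a chance to clean up before it terminates.

```python
processed = []

def worker():
    try:
        while True:
            item = yield
            processed.append(item)
    except GeneratorExit:
        # close() raises GeneratorExit at the suspended yield;
        # the generator may clean up here, but must not yield again.
        processed.append('closed')

w = worker()
next(w)          # prime
w.send(1)
w.send(2)
w.close()        # triggers the except branch, then the generator returns
```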

Matters needing attention

If a generator has been resumed with send and has not yet suspended at a yield again, it cannot be re-entered from another generator: it is still executing.

@consumer
def funcA():
    while True:
        data = yield
        print 'funcA receive', data
        fb.send(data * 2)

@consumer
def funcB():
    while True:
        data = yield
        print 'funcB receive', data
        fa.send(data * 2)

fa = funcA()
fb = funcB()
if __name__ == '__main__':
    fa.send(10)

Output:

funcA receive 10

funcB receive 20

ValueError: generator already executing

The Generator and the Coroutine

Back to coroutines: see Wikipedia (en.wikipedia.org/wiki/Corout… ). My own simple (perhaps one-sided) understanding is that with coroutines, the programmer controls the switching. Processes and threads are switched by the operating system's scheduler, while for coroutines the programmer decides when to switch out and when to switch back. Coroutines are much lighter than processes and threads and have lower context-switching overhead. In addition, because the programmer controls the scheduling, a task can to some extent avoid being interrupted in the middle. Which scenarios suit coroutines? I would generalize them as non-blocking wait scenarios, such as game programming, asynchronous IO, and event-driven code.

In Python, the generator's send and throw methods make it look like a coroutine, but a generator is only a semicoroutine. The Python documentation describes the generator as follows:

“All of this makes generator functions quite similar to coroutines; they yield multiple times, they have more than one entry point and their execution can be suspended. The only difference is that a generator function cannot control where should the execution continue after it yields; The control is always transferred to the generator’s caller.”

However, more can be achieved with the enhanced generator. For example, in the yield_dec example mentioned earlier, the code can only passively wait until the time is up before continuing. In some cases, such as when an event is triggered, we want to resume execution immediately, and we care what the event is; for that we need the generator's send. In another case, we need to terminate execution early, so we deliberately call close and do some cleanup in the code; pseudocode is as follows:

@yield_dec
def do(a):
    print 'do', a
    try:
        event = yield 5
        print 'post_do', a, event
    finally:
        print 'do sth'

As for the other example mentioned earlier, asynchronous calls between services (processes), that is also a good fit for coroutines. The callback approach splits the code, dividing one piece of logic across multiple functions, while the coroutine approach reads much better. In other languages, such as C# and Go, coroutines are standard implementations; in Go in particular, goroutines are the cornerstone of high concurrency. Python 3.x adds coroutine support with asyncio and async/await. In the 2.7 environment used by the author, greenlets can also be used.
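For completeness, a minimal Python 3 sketch of the same producer-consumer idea written with asyncio and async/await (the function names and the None sentinel convention are illustrative, not a standard API):

```python
import asyncio

async def produce(queue, items):
    # Producer: put items on the queue, then a None sentinel meaning "done".
    for item in items:
        await queue.put(item)
    await queue.put(None)

async def consume(queue, results):
    # Consumer: pull items until the sentinel arrives.
    while True:
        item = await queue.get()
        if item is None:
            break
        results.append(item * 2)

async def main():
    queue = asyncio.Queue()
    results = []
    # Run producer and consumer concurrently; the event loop switches
    # between them at each await, much like yield suspends a generator.
    await asyncio.gather(produce(queue, [1, 2, 3]),
                         consume(queue, results))
    return results

results = asyncio.run(main())
```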