Author: MING. Personal public account: Python programming time. Personal WeChat: MrBensonwon

Hello, this is part eight of the concurrent programming series.

Note: This series is updated on the WeChat official account. To read the latest articles, follow the public account.


In the last article, we finally reached one of the most advanced, most important, and certainly most difficult parts of concurrent programming in Python: coroutines.

Before you read this post, make sure you know something about generators. If you don't, that's fine. A few minutes with my last article will give you an introduction to generators and a first look at coroutines.

Again, all code in this series is written for Python 3, and it is recommended that you move to Python 3 as soon as possible.


Why use coroutines

In the last article, we successfully moved from the basics of generators and their use to coroutines.

But there must be a lot of people who know what a coroutine is without knowing why to use one. In other words, when should you use coroutines, and what advantages do they have over multithreading?

Before I start with yield from, I want to address this issue that has confused many people.

Let me give you an example. Suppose we are writing a crawler. We need to crawl multiple web pages, for example two pages (two spider functions): fetch the HTML (which takes IO time), then parse the HTML to get the data we're interested in.

Our code structure is simplified as follows:

def spider_01(url):
    html = get_html(url)
    ...
    data = parse_html(html)

def spider_02(url):
    html = get_html(url)
    ...
    data = parse_html(html)

As we all know, waiting in get_html() for the page to come back is IO-bound. For one page that is fine, but if we crawl a very large number of pages, the total waiting time becomes surprisingly long and is a huge waste.

A smart programmer would of course like to pause at get_html() and do something else instead of waiting idly for the page to return, then after a while go back to where the pause happened, receive the returned HTML, and continue with parse_html(html).

With conventional methods, it is almost impossible to achieve this effect. So Python thoughtfully provides this capability in the language itself: the yield syntax, which lets a function pause.
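As a minimal sketch of "pausing in a function", the snippet below uses a hypothetical fetch_page() generator and a log list (both are illustrative names, not part of the crawler above) to record the order in which things happen:

```python
log = []

def fetch_page():
    log.append("request sent")
    yield                          # pause here; control goes back to the caller
    log.append("response handled")

task = fetch_page()
next(task)                         # runs up to the yield, then pauses
log.append("caller does other work")
next(task, None)                   # resumes after the yield; the generator then finishes
print(log)
```

The caller's own work lands between the two halves of fetch_page(), which is exactly the pause-and-resume behavior described above.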

Think about it: without coroutines, writing a concurrent program may run into the following problems:

  1. Achieving asynchronous concurrency with conventional synchronous programming is either unsatisfactory or extremely difficult.
  2. Because of the GIL, only one thread runs Python bytecode at a time, and multithreaded code still involves frequent locking, unlocking, and thread switching, which greatly reduces concurrency performance.

Coroutines address exactly these problems. Their characteristics are:

  1. Coroutines switch between tasks within a single thread
  2. Asynchronous code is written in a synchronous style
  3. Locks are no longer needed, which improves concurrency performance
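The characteristics above can be sketched with plain generators. This is only an illustration under assumptions: spider() and run_all() are hypothetical names, and the bare yield stands in for a real IO wait.

```python
from collections import deque

def spider(name, steps, log):
    for step in range(steps):
        log.append((name, step))   # pretend to do one piece of work
        yield                      # hand control back to the scheduler

def run_all(tasks):
    # A tiny round-robin scheduler: everything runs in one thread, no locks.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)             # run the task until its next yield
        except StopIteration:
            continue               # task finished; do not re-queue it
        queue.append(task)

log = []
run_all([spider("spider_01", 2, log), spider("spider_02", 2, log)])
print(log)
```

The two spiders take turns inside a single thread, so the shared log list needs no lock.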


How to use yield from

yield from is syntax that was introduced in Python 3.3, so it is not available in Python 2.

yield from is followed by an iterable, which can be a plain iterable, an iterator, or even a generator.

Simple application: concatenating iterables

We can compare an example of using yield with an example of using yield from.

Using yield

# string
astr='ABC'
# list
alist=[1, 2, 3]
# dictionary
adict={"name":"wangbm", "age":18}
# generator
agen=(i for i in range(4, 8))

def gen(*args, **kw):
    for item in args:
        for i in item:
            yield i

new_list=gen(astr, alist, adict, agen)
print(list(new_list))
# ['A', 'B', 'C', 1, 2, 3, 'name', 'age', 4, 5, 6, 7]

Using yield from

# string
astr='ABC'
# list
alist=[1, 2, 3]
# dictionary
adict={"name":"wangbm", "age":18}
# generator
agen=(i for i in range(4, 8))

def gen(*args, **kw):
    for item in args:
        yield from item

new_list=gen(astr, alist, adict, agen)
print(list(new_list))
# ['A', 'B', 'C', 1, 2, 3, 'name', 'age', 4, 5, 6, 7]

Comparing the two versions shows that when yield from is followed by an iterable, it yields each element of that iterable one by one. The code is simpler and clearer than the version using yield.

Complex applications: nesting of generators

If you think that is all yield from can do, you are underestimating it. More powerful features are still to come.

When yield from is followed by a generator, you get nested generators.

Of course, yield from is not required to nest generators, but using it lets us avoid handling all kinds of unexpected exceptions ourselves and focus on the business code.

If you use plain yield to do the same thing, the code becomes harder to write, less efficient to develop, and less readable. Since Python has been so thoughtful, we should certainly take advantage of it.
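As a small illustration of one generator delegating to another, here is a sketch of a recursive flatten() helper. The function name and the sample data are assumptions made for this example only.

```python
def flatten(items):
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)   # delegate to a nested generator
        else:
            yield item

nested = [1, [2, [3, 4]], "AB", [5]]
flat = list(flatten(nested))
print(flat)  # [1, 2, 3, 4, 'AB', 5]
```

Each nested list gets its own generator, and yield from passes its values up the chain without any extra loop.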

So before we go into it, we need to know a few concepts:

  1. Caller: the client code that calls the delegate generator
  2. Delegate generator: a generator function that contains a yield from expression
  3. Child generator: the generator function that follows yield from

You may not know what they all mean, but that’s okay. Here’s an example.

This is an example of a real-time average calculation. For example, if 10 is passed in the first time, the average returned is 10. If 20 is passed in the second time, the average returned is (10+20)/2=15. If 30 is passed in the third time, the average returned is (10+20+30)/3=20

# child generator
def average_gen():
    total = 0
    count = 0
    average = 0
    while True:
        new_num = yield average
        count += 1
        total += new_num
        average = total/count

# delegate generator
def proxy_gen():
    while True:
        yield from average_gen()

# the caller
def main():
    calc_average = proxy_gen()
    next(calc_average)            # prime the generator
    print(calc_average.send(10))  # Print: 10.0
    print(calc_average.send(20))  # Print: 15.0
    print(calc_average.send(30))  # Print: 20.0

if __name__ == '__main__':
    main()

Read the above code carefully and you should easily understand the relationship between the caller, the delegate generator, and the child generator, so I won't go into detail.

The purpose of a delegate generator is to establish a two-way channel between the caller and the child generator.

What do I mean by a two-way channel? The caller can send a message directly to the child generator via send(), and the values the child generator yields are returned directly to the caller.
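The two-way channel can be sketched as follows. The child(), delegate(), and log names are hypothetical and exist only for this illustration.

```python
def child(log):
    while True:
        value = yield len(log)   # yielded values go straight to the caller
        log.append(value)        # sent values arrive here, inside the child

def delegate(log):
    yield from child(log)        # the delegate only connects the two sides

log = []
gen = delegate(log)
first = next(gen)        # prime: the child yields 0
second = gen.send("a")   # the child stores "a", then yields 1
third = gen.send("b")    # the child stores "b", then yields 2
print(first, second, third, log)
```

Everything sent through the delegate ends up in the child's log, and every number the caller receives was produced by the child.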

You will often see code that assigns the result of yield from to a variable. What does this usage mean?

You might think that the values the child generator yields are intercepted by the delegate generator and assigned there. Write a demo and run it yourself: that is not what happens. As we said before, the delegate generator only acts as a bridge. It establishes a two-way channel, and it has neither the right nor the means to intercept whatever the child generator yields.
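One possible version of such a demo is sketched below. The names child(), delegate(), and seen are made up for this illustration.

```python
def child():
    received = yield "from child"      # goes straight to the caller
    return "child returned {!r}".format(received)

def delegate(seen):
    result = yield from child()        # assigned only when child() returns
    seen.append(result)

seen = []
gen = delegate(seen)
first = next(gen)            # 'from child', produced by child()
seen_before = list(seen)     # still empty: the delegate intercepted nothing
try:
    gen.send("hello")        # child() returns, then delegate() finishes
except StopIteration:
    pass
print(first, seen_before, seen)
```

The variable on the left of yield from never sees the yielded string; it only receives what child() returns.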

To explain this usage, I'll reuse the real-time average example, with some modifications. I added some comments that will hopefully help.

# child generator
def average_gen():
    total = 0
    count = 0
    average = 0
    while True:
        new_num = yield average
        if new_num is None:
            break
        count += 1
        total += new_num
        average = total/count

    # Each return means that the current coroutine ends.
    return total,count,average

# delegate generator
def proxy_gen():
    while True:
        # The variable to the left of yield from is assigned, and the code after it runs,
        # only when the child generator terminates (returns).
        total, count, average = yield from average_gen()
        print("Calculated!!\nTotal of {} values, sum: {}, average: {}".format(count, total, average))

# the caller
def main():
    calc_average = proxy_gen()
    next(calc_average)            # prime the coroutine
    print(calc_average.send(10))  # Print: 10.0
    print(calc_average.send(20))  # Print: 15.0
    print(calc_average.send(30))  # Print: 20.0
    calc_average.send(None)      # end the coroutine
    # If calc_average.send(10) is called again, a new coroutine is started,
    # because the previous one has ended.

if __name__ == '__main__':
    main()

Running it prints:

10.0
15.0
20.0
Calculated!!
Total of 3 values, sum: 60, average: 20.0
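The last comment in main() says that sending again after send(None) starts a new coroutine. A minimal sketch of that behavior follows. It assumes a results list (not in the original) that collects what the delegate receives instead of printing it.

```python
def average_gen():
    total, count, average = 0, 0, 0
    while True:
        new_num = yield average
        if new_num is None:
            break
        count += 1
        total += new_num
        average = total / count
    return total, count, average

def proxy_gen(results):
    while True:
        result = yield from average_gen()
        results.append(result)        # runs only after the child returns

results = []
calc = proxy_gen(results)
next(calc)
calc.send(10)
calc.send(20)
restart_value = calc.send(None)   # the child ends; a new child starts and yields 0
fresh = calc.send(10)             # the new child only knows about this 10
print(results, restart_value, fresh)
```

Because the delegate loops forever, send(None) does not raise StopIteration here; it returns the initial average of the freshly started child.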


Why use yield from

Now I'm sure you're asking: if the delegate generator is just a two-way channel, what do I need it for? Why not call the child generator directly?

Heads up, this is the important part ~~~

Now let’s discuss what’s so good about yield from that we must use it.

Because it helps us handle exceptions

Suppose we drop the delegate generator and call the child generator directly. Then we need to change the code to something like the following, where we have to catch and handle the exception ourselves instead of letting yield from do it.

# child generator
def average_gen():
    total = 0
    count = 0
    average = 0
    while True:
        new_num = yield average
        if new_num is None:
            break
        count += 1
        total += new_num
        average = total/count
    return total,count,average

# the caller
def main():
    calc_average = average_gen()
    next(calc_average)            # prime the coroutine
    print(calc_average.send(10))  # Print: 10.0
    print(calc_average.send(20))  # Print: 15.0
    print(calc_average.send(30))  # Print: 20.0

    # ---------------- Note -----------------
    try:
        calc_average.send(None)
    except StopIteration as e:
        # The return value of the generator is carried in e.value
        total, count, average = e.value
        print("Calculated!!\nTotal of {} values, sum: {}, average: {}".format(count, total, average))
    # ---------------- Note -----------------

if __name__ == '__main__':
    main()

At this point, you might say: isn't it just one StopIteration exception? Catching it yourself is no big deal.

You wouldn't say that if you knew what yield from quietly does for us behind the scenes.

To see what yield from does for us, look at the following code, which spells out roughly what the statement RESULT = yield from EXPR does.

# Some notes:
#   _y: the value produced by the child generator
#   _r: the final value of the yield from expression
#   _s: the value sent by the caller via send()
#   _e: the exception object

_i = iter(EXPR)

try:
    _y = next(_i)
except StopIteration as _e:
    _r = _e.value

else:
    while 1:
        try:
            _s = yield _y
        except GeneratorExit as _e:
            try:
                _m = _i.close
            except AttributeError:
                pass
            else:
                _m()
            raise _e
        except BaseException as _e:
            _x = sys.exc_info()
            try:
                _m = _i.throw
            except AttributeError:
                raise _e
            else:
                try:
                    _y = _m(*_x)
                except StopIteration as _e:
                    _r = _e.value
                    break
        else:
            try:
                if _s is None:
                    _y = next(_i)
                else:
                    _y = _i.send(_s)
            except StopIteration as _e:
                _r = _e.value
                break
RESULT = _r

The code above is a little complex. Interested readers can study it together with the following notes.

  1. Values produced by the iterator are returned directly to the caller.
  2. Any value sent to the delegate generator (that is, the outer generator) with send() is passed directly to the iterator. If the sent value is None, the iterator's next() method is called; otherwise, the iterator's send() method is called. If the call to the iterator raises StopIteration, the delegate generator resumes execution at the statement following yield from; any other exception raised by the iterator is passed up to the delegate generator.
  3. The child generator may be just a plain iterator rather than a generator acting as a coroutine, so it may not support the .throw() and .close() methods, and looking them up may raise AttributeError.
  4. Any exception thrown into the delegate generator, other than GeneratorExit, is passed to the iterator's throw() method. If the throw() call raises StopIteration, the delegate generator resumes execution; any other exception is passed up to the delegate generator.
  5. If GeneratorExit is thrown into the delegate generator, or the delegate generator's close() method is called, the iterator's close() method is called if it has one. If that close() call raises an exception, the exception is passed up to the delegate generator; otherwise, the delegate generator raises GeneratorExit.
  6. When the iterator terminates, the value of the yield from expression is the first argument of the StopIteration exception it raised.
  7. A return expr statement in a generator exits the generator and raises StopIteration(expr).
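Points 4 and 5 above can be observed with a small sketch. The child(), delegate(), and log names are assumptions for this illustration only.

```python
def child(log):
    try:
        while True:
            try:
                yield "ready"
            except ValueError as exc:
                log.append("child caught {}".format(exc))
    finally:
        log.append("child closed")   # runs when close() reaches the child

def delegate(log):
    yield from child(log)

log = []
gen = delegate(log)
next(gen)                                    # prime: the child yields "ready"
reply = gen.throw(ValueError("bad input"))   # forwarded to the child's throw()
gen.close()                                  # forwarded to the child's close()
print(reply, log)
```

The delegate contains no exception handling at all, yet both the thrown ValueError and the close() call reach the child.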

If you're not interested in the details, just remember that yield from does a lot of thorough exception handling for us. Implementing all of this ourselves means writing more code, and poorly written code hurts readability. More importantly, it is easy to miss a case, and a single unhandled exception could crash the program.
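To illustrate how easy it is to miss a case, the sketch below compares a hand-written delegate that uses a plain for loop with one that uses yield from. The names echo(), naive_delegate(), and real_delegate() are hypothetical.

```python
def echo():
    received = None
    while True:
        received = yield received   # echo back whatever was sent

def naive_delegate():
    for value in echo():            # only forwards yielded values
        yield value                 # anything sent here is silently dropped

def real_delegate():
    yield from echo()               # forwards send() to the child as well

naive = naive_delegate()
next(naive)
naive_reply = naive.send("hi")      # the child never receives "hi"

real = real_delegate()
next(real)
real_reply = real.send("hi")        # the child receives "hi" and echoes it
print(naive_reply, real_reply)
```

The naive version looks reasonable but loses every sent value, and it would also need its own handling for throw() and close() to match what yield from provides.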

————————————————————————————————