Article | XuanYuanYuLong

Source: Python Technology “ID: pythonall”

After writing so much Python code, have you ever wondered what Python is really doing?

Or, to put it another way, when we execute Python code, how does it work?

Python is well known as an interpreted language.

An “interpreted” language is, of course, contrasted with compiled languages such as C. A compiled language converts the entire program into a binary file that the machine can execute directly; an interpreted language instead relies on an interpreter to “interpret” and carry out the behavior described by the code, line by line.

For this reason, in interpreted languages such as Python, some errors that look obvious only show up when the offending statement is actually executed, which often surprises new users.

So how exactly does the Python interpreter “interpret” Python code?

In fact, Python has its own virtual machine, with an execution mechanism similar to Java’s. What the virtual machine actually executes is a form of bytecode.

So the execution of a Python program still involves a “compilation” step: the Python source code is compiled into bytecode.
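
As a small illustration of this compilation step, the sketch below looks at the code object that CPython attaches to a function. The function `square` is a hypothetical example, not taken from the article. The attributes `__code__`, `co_code`, and `co_varnames` are standard in CPython, but the exact bytes differ between Python versions, so no specific byte values are shown.

```python
def square(x):
    # Hypothetical example function, used only for illustration.
    return x * x


code_obj = square.__code__        # the compiled code object behind the function
raw_bytecode = code_obj.co_code   # the bytecode itself, as a bytes object

print(type(raw_bytecode))         # <class 'bytes'>
print(len(raw_bytecode) > 0)      # True
print(code_obj.co_varnames)       # ('x',)
```

The raw bytes are not meant to be read by hand, which is why a disassembler is useful.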

Python also provides a module called dis for viewing and analyzing Python bytecode.

1. The dis module

The dis module contains a function of the same name, dis, which disassembles an object in the current namespace and prints its bytecode.

import dis


def add(add_1, add_2):
    sum_value = add_1 + add_2


dis.dis(add)

The execution result is as follows:

  4           0 LOAD_FAST                0 (add_1)
              2 LOAD_FAST                1 (add_2)
              4 BINARY_ADD
              6 STORE_FAST               2 (sum_value)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

The leading number “4” indicates that these bytecode instructions correspond to line 4 of the script.

The next column of numbers is the address (byte offset) of each instruction. Reading down the column reveals a pattern: each instruction’s address is exactly 2 greater than that of the previous instruction. Is this a coincidence?

It is not. According to the change notes in the official dis module documentation, starting with Python 3.6 “each instruction uses 2 bytes”. The address of each instruction is therefore the address of the previous instruction plus 2.
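
The pattern can also be checked programmatically. The sketch below uses `dis.get_instructions`, which yields one `Instruction` object per bytecode instruction, and collects the `offset` of each. It only assumes Python 3.6 or later. On newer CPython releases some gaps may be larger than 2, because extra 2-byte slots can follow certain instructions, but every offset should still be a multiple of 2.

```python
import dis


def add(add_1, add_2):
    sum_value = add_1 + add_2


# Byte offset ("address") of every instruction in add
offsets = [instr.offset for instr in dis.get_instructions(add)]
print(offsets)

# With 2-byte instructions, every offset is a multiple of 2
all_even = all(offset % 2 == 0 for offset in offsets)
print(all_even)
```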

After that comes a column of names, which are the human-readable names of the corresponding instructions. As the name suggests, LOAD_FAST loads an object from somewhere; the “FAST” refers to local variables, which CPython keeps in an array and accesses by index instead of looking them up by name.

The rightmost column is the operand of the instruction, that is, the object it operates on. The number is an index, similar to an address, and the string in parentheses is the name that object has in the Python code.

With these pieces, we can roughly read the generated bytecode:
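
To make the relationship between the numeric operand and the name in parentheses concrete, here is a minimal sketch. It assumes that, for instructions operating on local variables, the number is an index into the function’s `co_varnames` tuple. Only `STORE_FAST` is inspected, because the exact instructions used to load locals vary between Python versions.

```python
import dis


def add(add_1, add_2):
    sum_value = add_1 + add_2


local_names = add.__code__.co_varnames
print(local_names)  # ('add_1', 'add_2', 'sum_value')

# Keep only the instruction that stores into a local variable
stores = [i for i in dis.get_instructions(add) if i.opname == "STORE_FAST"]

for instr in stores:
    # instr.arg is the numeric operand, instr.argval the resolved name
    print(instr.arg, instr.argval, local_names[instr.arg])
```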

Python first loads the first argument of add, add_1, and then the second argument, add_2, after it. It then executes the BINARY_ADD instruction, which adds the two values just loaded. Next, STORE_FAST stores the sum in another location, sum_value. Finally, the constant None is loaded and returned.

Reading through this execution process, a common data structure naturally comes to mind: the stack.
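
The stack behavior can be sketched with a toy interpreter. This is not how CPython is implemented; it is a simplified model that replays the six instructions from the listing above, using a plain list as the stack. The names `run_add`, `fast_locals`, and `constants` are hypothetical and exist only for this illustration.

```python
def run_add(add_1, add_2):
    """Replay the six instructions of add using a list as the value stack."""
    fast_locals = [add_1, add_2, None]  # slots 0, 1, 2: add_1, add_2, sum_value
    constants = [None]                  # slot 0: the constant None
    stack = []

    program = [
        ("LOAD_FAST", 0),
        ("LOAD_FAST", 1),
        ("BINARY_ADD", None),
        ("STORE_FAST", 2),
        ("LOAD_CONST", 0),
        ("RETURN_VALUE", None),
    ]

    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(fast_locals[arg])      # push a local variable
        elif opname == "BINARY_ADD":
            right = stack.pop()                 # pop both operands,
            left = stack.pop()
            stack.append(left + right)          # push their sum
        elif opname == "STORE_FAST":
            fast_locals[arg] = stack.pop()      # pop into a local slot
        elif opname == "LOAD_CONST":
            stack.append(constants[arg])        # push a constant
        elif opname == "RETURN_VALUE":
            return stack.pop(), fast_locals     # pop the return value
    raise RuntimeError("program ended without RETURN_VALUE")


result, final_locals = run_add(1, 2)
print(result)        # None
print(final_locals)  # [1, 2, 3]
```

The function returns None, just like the real add, while the sum ends up in the slot for sum_value.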


Of course, that is not the point of this article. It would take several longer articles to really explore Python’s execution mechanism.

The dis.dis function can not only show the bytecode of objects in the current script; it can also disassemble a string of source code passed to it directly:

# test_dis.py
import dis


s = """
def add(add_1, add_2):
    sum_value = add_1 + add_2

print("Hello World!")

import sys
"""

dis.dis(s)

Disassembly result:

  2           0 LOAD_CONST               0 (<code object add at 0x0000019FF66DFDB0, file "<dis>", line 2>)
              2 LOAD_CONST               1 ('add')
              4 MAKE_FUNCTION            0
              6 STORE_NAME               0 (add)

  5           8 LOAD_NAME                1 (print)
             10 LOAD_CONST               2 ('Hello World!')
             12 CALL_FUNCTION            1
             14 POP_TOP

  7          16 LOAD_CONST               3 (0)
             18 LOAD_CONST               4 (None)
             20 IMPORT_NAME              2 (sys)
             22 STORE_NAME               2 (sys)
             24 LOAD_CONST               4 (None)
             26 RETURN_VALUE
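
When the listing needs to be processed rather than just read, the output can be captured instead of printed. The sketch below relies on the `file` keyword argument of `dis.dis` and an in-memory `io.StringIO` buffer. The one-line snippet in `source` is a hypothetical example, and the exact instructions in the listing depend on the Python version.

```python
import dis
import io

source = "total = 1 + 2"  # hypothetical snippet to disassemble

buffer = io.StringIO()
dis.dis(source, file=buffer)  # write the listing into the buffer, not stdout
listing = buffer.getvalue()

print(listing)
```

The captured text can then be searched, for example to check whether a particular instruction name such as STORE_NAME appears.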

2. The compile function

Besides passing source code as a string directly to dis.dis, we can also use the built-in function compile to build a code object for a script, and then use dis.dis to view its bytecode.

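# test_compile.py
import dis


# Read the source of the script to be compiled
with open("test_dis.py", "r", encoding="utf-8") as f:
    s = f.read()


# Compile the source into a code object ("exec" mode handles whole modules)
compile_obj = compile(s, "test_dis.py", "exec")


dis.dis(compile_obj)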

Bytecode output:

  1           0 LOAD_CONST               0 (0)
              2 LOAD_CONST               1 (None)
              4 IMPORT_NAME              0 (dis)
              6 STORE_NAME               0 (dis)

 11           8 LOAD_CONST               2 ('\ndef add(add_1, add_2):\n    sum_value = add_1 + add_2\n\nprint("Hello World!")\n\nimport sys\n')
             10 STORE_NAME               1 (s)

 13          12 LOAD_NAME                0 (dis)
             14 LOAD_METHOD              0 (dis)
             16 LOAD_NAME                1 (s)
             18 CALL_METHOD              1
             20 POP_TOP
             22 LOAD_CONST               1 (None)
             24 RETURN_VALUE
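
A code object produced by compile is not only for disassembly; it can also be executed. The sketch below is a minimal illustration with hypothetical source snippets and the placeholder file name `"<example>"`. It shows the `"exec"` mode for statements and the `"eval"` mode for a single expression.

```python
source = "answer = 6 * 7"  # hypothetical snippet

# "exec" mode compiles a sequence of statements
code_obj = compile(source, "<example>", "exec")
namespace = {}
exec(code_obj, namespace)  # run the code object in its own namespace

print(code_obj.co_filename)  # <example>
print(code_obj.co_names)     # ('answer',)
print(namespace["answer"])   # 42

# "eval" mode compiles a single expression and yields its value
expr_obj = compile("6 * 7", "<example>", "eval")
print(eval(expr_obj))        # 42
```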

Conclusion

The dis module provides a way to look inside Python. Used properly and combined with other methods, it can be a quick and effective way to understand some of Python’s more confusing behavior.

I hope you can put this useful tool to good use.

References

[dis: Disassembler for Python bytecode] docs.python.org/zh-cn/3/lib…

[On how Python programs work] www.cnblogs.com/restran/p/4…
