- A Bit About Bytes: Understanding Python Bytecode-Pycon 2018
- The Nuggets translation Project
- Permanent link to this article: github.com/xitu/gold-m…
- Translator: cdpath
- Proofreader: HCMY
PyCon 2018 James Bennett – A Bit About Bytes: Understanding Python Bytecode
0:07 Welcome to “A Bit About Bytes”
0:11 Today we’re going to talk about Python bytecode
0:14 The title is more than just wordplay
0:17 it carries more meaning than that
0:20 Without further ado
0:22 please welcome Django core developer James Bennett
0:25 to start the talk
0:36 I want to start with a slightly existential question
0:38 Why are we at PyCon
0:41 because I love Python
0:45 right
0:52 Why do you love Python?
0:55 Because we all understand
0:57 that reading code takes more time than writing it
1:03 so we try to make our code more readable
1:05 And of course, we love Python
1:08 because Python was built around one simple idea:
1:12 code should be easy to read
1:19 Python is clear, readable, and easy to understand
1:24 Even if you’re not a programmer
1:25 you can look at Python code
1:28 and understand the logic
1:30 Right?
1:36 That’s Python
1:41 At least CPython downloaded from Python.org
1:50 Now I’m going to show you where it comes from
1:53 how it works
1:56 what the use of understanding it is
1:59 and finally how to apply it in practice
2:01 or at least in theory
2:05 but before we do that
2:07 we’re going to learn a little about how computers work
2:08 and how programming languages work
2:12 I love this tweet
So beautiful and so true
2:19 But we do need to understand how computers work
The CPU inside a computer is a silicon chip
2:26 etched with carefully arranged circuits
2:32 Feed in a specific pattern of electrical current
2:35 and you get a different pattern of current out
2:37 and the result is predictable
We give these patterns names and meanings
2:45 We can say that this pattern of current represents addition
2:49 That’s how computers work
The names we have chosen
2:53 are called CPU instructions
2:56 sometimes also called machine code
3:00 Presented in a slightly more human-friendly form
3:01 it is assembly code
3:05 But even assembly language isn’t that easy to understand
3:08 Have you ever seen assembly code?
3:11 How many of you want to write assembly all the time?
We’d rather write source code
that is beautiful, clear, easy to read, and easy to understand
3:24 But computers only accept binary instructions
How do you build a bridge between them?
3:33 Several approaches have been tried over the years
3:35 Some languages use compilers first invented by Grace Murray Hopper
Compile the source code directly to machine code
3:45 These are the compiled languages
3:47 Some languages rely on interpreters
which translate source code into machine instructions directly at run time
3:57 These are interpreted languages
Python is an interpreted language
People also talk a lot about the Python interpreter
4:03 But there is a third kind of language
4:06 Some languages compile to instructions
4:10 that do not run on any real, physical CPU
4:16 I mean, you could build a CPU like that, but at least it doesn’t exist right now
4:20 These languages compile to instructions for a CPU that does not exist
4:25 and then run them with a program that simulates that CPU
4:28 This interpreter understands the instructions
4:30 and translates them into binary code that the real CPU accepts
4:36 These intermediate instructions are called bytecode
There are many languages that fall into this category
4:41 Anyone using Java?
4:45 Java-compiled bytecode runs on the Java virtual machine
4:47 Does anyone use .NET?
So there is C#
4:51 Compiled C# bytecode runs on the .NET virtual machine
4:55 And of course Python
4:58 Python-compiled bytecode runs on the Python virtual machine
Let’s take a closer look at how it works
5:04 This is a Python function that computes Fibonacci numbers
5:11 Very easy to understand
5:13 First it checks whether n is less than 2; if so, it returns immediately
5:18 Otherwise it computes the Fibonacci number in a loop
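The slide itself is not reproduced in this transcript. The following is only a sketch of a function that fits the description, assuming the local variable names n, current, and next that the talk mentions later; the speaker's exact code may differ.

```python
def fib(n):
    # Inputs below 2 are returned as they are.
    if n < 2:
        return n
    # Otherwise step through the sequence n times.
    # Note: "next" shadows the built-in of the same name inside this function.
    current, next = 0, 1
    while n:
        current, next = next, current + next
        n -= 1
    return current


print(fib(8))  # the talk later reports 21 for the eighth Fibonacci number
```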
5:23 How does Python actually execute this function?
5:25 Has anyone seen a file with the .pyc extension?
5:33 If you use Python 2, you know that Python 2
5:35 places a .pyc file of the same name next to the source file
5:40 If you use Python 3, the .pyc file is placed in the __pycache__ directory
5:47 You may have heard that these .pyc files are compiled Python
5:50 You’ve probably heard that .pyc files save recompilation time
5:52 This is Python bytecode
5:55 The .pyc file holds the bytecode produced by compiling the source code
So the next time you run this code
6:01 or the next time you import this module
6:03 Python does not need to compile it from scratch again
6:08 Python needs bytecode in this format to execute
6:13 So how do you understand how it works?
6:15 Suppose you open the Python interpreter
6:17 and type in the Fibonacci function
6:20 You get a function object
This object has a special attribute, __code__
6:27 which is a Python code object
6:32 Did anyone hear Emily Morehouse’s talk yesterday about parsing and ASTs (abstract syntax trees)?
6:36 It was a very good talk
6:38 You can learn something there about code objects
6:40 and how Python uses them
Today we will look at a different aspect
6:45 from another angle:
6:46 what happens after the syntax has been parsed
6:48 The code object contains everything Python needs to execute the function
6:54 It has a number of attributes, and we can look at what’s in them
6:56 and how they work
There is an attribute called co_consts
7:02 It is a tuple whose elements are all literals and constants referenced in the function body
7:06 You can see that it contains
7:09 the numbers 2, 0, and 1
7:11 a tuple containing 0 and 1
and also None
The None here looks strange
After all, None is never written in the function body
7:22 But Python puts None here for a reason
A Python function that does not explicitly use return
7:33 returns None
7:36 So None is in the tuple
7:45 because when Python compiles the function
7:47 it cannot always tell whether an explicit return will be reached
7:52 In general, that is impossible to know
7:55 These are the literals
7:59 There is one more attribute, co_varnames
8:01 Its elements are the names of the local variables
8:06 Here they are n, current, and next
8:12 Another attribute is co_names
8:15 Its elements are the non-local names referenced in the function body
8:18 This function does not use any non-local names
8:20 so it is an empty tuple
8:22 And finally, the most interesting attribute
8:25 co_code
8:30 This is the bytecode of the function
8:33 It is not a string, but a bytes object
8:36 It looks like a string because of the way Python 3
8:42 shows bytes that can be represented in ASCII as characters
8:47 That is just the default way Python displays bytes objects
8:49 But it’s not a string, and it can’t be treated as one
8:51 It’s just a sequence of bytes
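The attributes described above can be inspected directly. This is a minimal sketch using an assumed reconstruction of the Fibonacci function (the slide's exact code is not in the transcript); the contents of co_consts and co_code differ between CPython versions, so no specific values are shown for them.

```python
def fib(n):
    # Assumed reconstruction of the function from the slide.
    if n < 2:
        return n
    current, next = 0, 1
    while n:
        current, next = next, current + next
        n -= 1
    return current


code = fib.__code__
print(code.co_consts)    # literals and constants; exact contents vary by version
print(code.co_varnames)  # local variable names, arguments first
print(code.co_names)     # non-local names; empty for this function
print(code.co_code)      # the raw bytecode, a bytes object
```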
8:55 If we want to know what this long string of bytes means
8:57 Might as well start with the first byte
It looks like a pipe symbol, |
9:06 I don’t know whether you have the ASCII table memorized
9:08 I certainly can’t recite it
So I don’t know which decimal number the pipe symbol | corresponds to
9:15 But I can ask Python to tell me
Python says the decimal value of | is 124
9:24 So the value of the first byte of the bytecode is 124
9:26 That is still not useful information
9:30 Fortunately there is the dis module in the standard library
Its opname list contains the names of all Python bytecode instructions
The index into it is the decimal value of the bytecode
9:46 The instruction corresponding to 124 is LOAD_FAST
9:48 OK, so we know the decimal value of the first byte is 124
9:54 which means the LOAD_FAST instruction
9:57 The second byte of the bytecode is 0
10:00 Together they make LOAD_FAST 0
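The manual decoding steps can be sketched as follows. Opcode numbers are specific to a CPython version: the value 124 for LOAD_FAST is what the talk shows for Python 3.6, and newer interpreters may number opcodes differently or start a function with another instruction. The fib function is an assumed reconstruction of the slide.

```python
import dis


def fib(n):
    # Assumed reconstruction of the function from the slide.
    if n < 2:
        return n
    current, next = 0, 1
    while n:
        current, next = next, current + next
        n -= 1
    return current


raw = fib.__code__.co_code
first_byte = raw[0]            # indexing a bytes object yields an int
print(ord("|"))                # decimal value of the pipe character
print(first_byte, dis.opname[first_byte])

# The reverse mapping, from instruction name to number, is dis.opmap.
load_fast_number = dis.opmap["LOAD_FAST"]
print(load_fast_number)        # 124 in the talk's Python 3.6
```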
10:02 I don’t know whether you noticed on the first slide
10:05 but that is exactly what was shown there
10:08 LOAD_FAST 0 is a Python bytecode instruction
10:12 Exactly
10:15 This instruction means: look up index 0 in the tuple of local variable names
10:21 which is the local variable n
and push its value onto the top of the stack
10:29 We’ll cover the stacks later
10:31 But now I have to show you a shortcut
10:35 The way I showed you how to read bytecode is very tedious
10:38 There’s an easy way
10:41 Import dis, then call dis.dis()
10:44 You can pass it almost anything
10:47 such as a function
10:48 or a string of source code
10:50 or most other Python objects that contain code
10:52 dis.dis() will disassemble it
10:56 and print the bytecode in a readable form
11:00 The result of passing in the Fibonacci function
11:02 is what was on the first slide
11:05 This is the bytecode of the Fibonacci function
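A minimal sketch of the shortcut, again with an assumed reconstruction of the Fibonacci function. The listing is captured through the file argument of dis.dis() so it can be reused; the exact instructions printed depend on the CPython version.

```python
import dis
import io


def fib(n):
    # Assumed reconstruction of the function from the slide.
    if n < 2:
        return n
    current, next = 0, 1
    while n:
        current, next = next, current + next
        n -= 1
    return current


buffer = io.StringIO()
dis.dis(fib, file=buffer)  # without file=..., the listing goes to stdout
listing = buffer.getvalue()
print(listing)
```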
11:11 A few points worth noting:
11:12 The numbers on the left
11:17 2, 3, 4, 5, 6, 7, 8
11:18 are the line numbers of the corresponding source code
11:20 and also mark where each block of instructions begins
11:22 You must have noticed
11:25 that each line of source code corresponds to several bytecode instructions
11:30 There is a number next to each instruction
11:32 And this number is always even
11:34 Would anyone like to guess why it’s even?
11:38 This is new in Python 3.6
11:41 These numbers are bytecode offsets
11:44 If you look at __code__.co_code
11:46 and index into it
11:49 at, say, 6
11:53 you get POP_JUMP_IF_FALSE
11:57 The offsets are even
11:59 because of a change in Python 3.6
12:02 Not all bytecode instructions take an argument
12:04 but since Python 3.6 every instruction carries an argument byte
12:07 whether it uses it or not
12:08 so every bytecode instruction takes up two bytes
12:10 That is also easier to implement
12:16 Some instructions have an argument that is too large
12:19 to fit in one byte
12:21 so it is split across several bytes
12:22 but the total is always a multiple of two bytes
12:24 In Python 3.5 or earlier
12:28 for the same input
12:29 the bytecode you get might have odd offsets
12:31 because not every instruction in Python 3.5 carries an argument
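The even offsets can be checked with dis.get_instructions(), which yields one record per instruction. This sketch assumes Python 3.6 or later and an assumed reconstruction of the Fibonacci function.

```python
import dis


def fib(n):
    # Assumed reconstruction of the function from the slide.
    if n < 2:
        return n
    current, next = 0, 1
    while n:
        current, next = next, current + next
        n -= 1
    return current


offsets = [instruction.offset for instruction in dis.get_instructions(fib)]
print(offsets[:8])

# Each instruction occupies a whole number of two-byte units on 3.6+.
all_even = all(offset % 2 == 0 for offset in offsets)
print(all_even)
```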
12:33 Another point worth noting
12:37 These right-pointing markers (>>)
For example, at line 4 of the source code, offset 12
12:42 the LOAD_CONST here
12:44 and at line 5 of the source code, offset 22
12:47 These are jump targets
12:50 This is Python’s way of telling you that other instructions might jump to these places
12:57 Remember the loop in the Fibonacci function?
12:59 It starts with a test
13:01 and each time around the loop
13:04 execution jumps back to that earlier instruction
These markers show that an instruction can be the target of a jump from elsewhere
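Jump targets are also available programmatically: each record from dis.get_instructions() has an is_jump_target flag. A sketch with an assumed reconstruction of the Fibonacci function; the offsets and instruction names found will vary by CPython version and will not necessarily match the 12 and 22 on the slide.

```python
import dis


def fib(n):
    # Assumed reconstruction of the function from the slide.
    if n < 2:
        return n
    current, next = 0, 1
    while n:
        current, next = next, current + next
        n -= 1
    return current


jump_targets = [
    (instruction.offset, instruction.opname)
    for instruction in dis.get_instructions(fib)
    if instruction.is_jump_target
]
print(jump_targets)
```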
13:12 OK, we have looked at some bytecode
13:17 and we also know how to read raw bytecode:
13:19 get the bytecode first
13:22 then manually decode the instructions these bytes correspond to
13:24 or just use dis.dis()
13:26 Along the way we have touched on how Python works
13:29 and how Python uses bytecode
The Python virtual machine implemented by CPython is stack-oriented
In other words, its underlying data structure is the stack
13:40 If you haven’t used stacks before (here’s a brief introduction)
13:43 Stacks are a bit like lists
13:45 but support just two important operations
13:48 A stack has two ends; let’s call them top and bottom
One operation is push
13:52 which puts a value on top of the stack
13:55 The other operation is pop
13:57 which takes the value off the top of the stack, removing and returning it
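A Python list already behaves like a stack if you only touch its end. This small sketch treats the end of the list as the top; it illustrates the two operations and is not how CPython stores its stacks internally.

```python
stack = []

# push: place values on top of the stack
stack.append("first")
stack.append("second")

# pop: remove and return whatever is on top
top = stack.pop()
print(top)    # the most recently pushed value
print(stack)  # what remains underneath
```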
14:01 Each call to a Python function pushes a call frame onto the top of the call stack
14:07 The call stack keeps track of every function call in progress
14:09 Once the function returns, the corresponding call frame is popped off the call stack
14:17 and the return value is pushed onto the calling frame
So if I call the Fibonacci function
14:21 (more on that later)
14:23 I get the return value back
14:24 While the code in a call frame is executing
14:31 it also uses two other stacks
14:34 One is the evaluation stack, also known as the data stack
14:40 Python uses it to store all the data it uses
14:43 Most of the computation of Python functions takes place here
14:46 And most of the instructions operate on the top of the stack
14:53 The other stack is the code block stack
14:55 which records the currently active code blocks
15:00 Code blocks are things like try/except and with blocks
Python needs this because statements like break and continue apply to the current code block
15:11 so Python needs to know what the current code block is
15:13 It does that by maintaining a stack of code blocks
Every time it encounters one of these structures
Python pushes it onto the code block stack
15:21 and pops it off when the block ends
Let’s see how the function is executed
15:27 Suppose we want to find the eighth Fibonacci number
15:31 We call our Python Fibonacci function to get it
This call compiles to three bytecode instructions
15:39 LOAD_GLOBAL, LOAD_CONST, and CALL_FUNCTION
15:42 Let’s look closely
At first the stack is empty
15:46 The first instruction is LOAD_GLOBAL
It loads the global name fib, which is the Fibonacci function
15:54 The name is looked up among the non-local names in the co_names tuple
After finding the function, Python pushes the function object onto the top of the stack
Next up is LOAD_CONST
It loads the element at index 1 of the constants tuple
16:10 Remember
16:12 the element at index 0 is None
So we get the integer 8
16:17 which is the argument to the function
16:19 and push it onto the top of the stack
Next comes the CALL_FUNCTION instruction
16:26 Its argument is 1, the number of positional arguments
16:29 When only positional arguments are used, the way Python calls a function is
16:34 to push the function onto the top of the stack
then push the positional arguments on top of it (above the function object)
And then, when the function is called
16:42 it pops all the positional arguments
so the next element on the stack is the function object, which is popped as well
16:48 A new call frame is pushed onto the call stack
and the Fibonacci function is executed in the new call frame
16:56 producing the return value 21
17:00 Then the call frame is popped off the call stack
17:03 and the return value is pushed back onto the stack
17:10 These are the steps Python goes through to call the Fibonacci function
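The instructions for the call can be inspected by compiling the expression on its own. This is a sketch under two caveats: compiled at module level the name is loaded with LOAD_NAME rather than the LOAD_GLOBAL you would see inside a function, and the call instruction is CALL_FUNCTION on Python 3.6 to 3.10 but a differently named CALL instruction on newer versions. The file name "<example>" is a placeholder.

```python
import dis

# Compile only the call expression; fib does not need to exist to compile it.
call_code = compile("fib(8)", "<example>", "eval")
call_opnames = [instruction.opname for instruction in dis.get_instructions(call_code)]

print(call_code.co_names)  # the name fib is looked up through this tuple
print(call_opnames)        # load the function, load the argument, call, return
```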
17:14 The CALL_FUNCTION instruction here applies only to positional arguments
17:18 If there are keyword arguments
17:20 the CALL_FUNCTION_KW instruction is used
17:26 If argument unpacking is used
17:30 with the * or ** operator
17:33 the CALL_FUNCTION_EX instruction is used
This is how function calls work
17:42 If you’re interested
you can refer to the documentation for the dis module in the Python standard library
17:47 The dis module is very useful
17:53 Its documentation lists all the bytecode instructions
It also explains what these instructions do and what arguments they accept, whatever you want to know
18:00 about the technical details of Python bytecode
18:03 Here are a few more interesting things
18:07 The dis module has a function called distb
18:12 Have you ever run into a strange exception
18:15 and wondered where exactly it was raised?
18:18 dis.distb() can help
You can call it directly after an exception has occurred
18:29 or pass in a captured traceback object
18:33 distb disassembles the frame that was active on the call stack
18:39 and prints its bytecode
with an arrow pointing at the instruction that raised the exception
Let me give you an example
Let me divide a number by 0
18:51 Python raises an exception
18:54 import dis; dis.distb()
18:57 and you can print out the bytecode that was executing
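In a script, as opposed to the interactive prompt used in the talk, the traceback has to be captured and passed in explicitly. A minimal sketch; the divide helper is a made-up example, and the listing is written to a buffer through the file argument so it can be inspected.

```python
import dis
import io
import sys


def divide(a, b):
    # Hypothetical helper that fails when b is 0.
    return a / b


try:
    divide(1, 0)
except ZeroDivisionError:
    traceback_object = sys.exc_info()[2]
    buffer = io.StringIO()
    # distb follows the traceback to its innermost frame and disassembles it.
    dis.distb(traceback_object, file=buffer)
    listing = buffer.getvalue()

print(listing)  # the failing instruction is marked with an arrow
```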
19:00 If you still want to dig into the details
see the references I give at the end of the slides
19:04 You can take a look at the Python interpreter, which is written in C
This is the C source code of the Python bytecode interpreter as it was on GitHub two hours ago
19:16 It is essentially a huge switch statement
19:19 that looks up which operation to perform for each incoming opcode number
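The shape of such a dispatch loop can be imitated in a few lines of Python. This is a toy illustration only, not CPython's implementation: the run function and its instruction format are invented for this sketch, the instruction names are borrowed from Python 3.6, and an if/elif chain stands in for the C switch.

```python
def run(program):
    """Execute a list of (opname, argument) pairs on a toy stack machine."""
    stack = []
    for opname, argument in program:
        if opname == "LOAD_CONST":
            stack.append(argument)
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "BINARY_MULTIPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown instruction: " + opname)
    raise ValueError("program ended without RETURN_VALUE")


result = run([
    ("LOAD_CONST", 7),
    ("LOAD_CONST", 86400),
    ("BINARY_MULTIPLY", None),
    ("RETURN_VALUE", None),
])
print(result)
```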
19:27 Ok, now we know a little bit about bytecode
19:31 But what’s the use of bytecode?
19:34 What are the benefits of knowing bytecode?
19:40 Have you heard or used the Forth language?
19:46 or the newer Factor language?
Both Forth and Factor are stack-oriented programming languages
19:57 Python virtual machines are also stack-oriented
We talked about that a moment ago
20:01 It is basically all about pushing something onto the top of the stack
20:03 doing something with what is on top of the stack
20:05 and finally popping the result off
20:08 This is a little different from the way we’re used to programming
But a lot of programming languages are designed around this idea
and it’s good to understand this way of programming
You may never actually use it
but you can learn it
and broaden your programming horizons
Stack-oriented programming languages and virtual machines
20:34 with very few instructions
20:37 and a limited number of stack operations, can do amazing things
20:39 It is very clever indeed
Of course, knowing bytecode is also practical
Everyone likes to joke about C
and call C practically assembly language
because when you write or read C code you can see what machine code it will turn into
20:59 Python is the same to some extent
21:03 We can learn Python bytecode
21:05 and learn how to read it
21:07 to find out what bytecode our Python source code will be translated into
21:11 and how the Python interpreter executes it
21:16 All of this gives you insight
21:19 You also learn how Python works
21:26 and what everyone wants to know: how to improve the performance of Python code
21:30 Look at these two functions
21:33 They both do the same thing
21:35 compute the number of seconds in a week
21:37 But one way of writing it is faster
21:40 Can you see which one is faster?
21:46 I want you to think about it
21:49 why one function is faster than the other
21:52 and how you would find that out
21:55 The way to find out is to look at the bytecode
21:56 First get the bytecode using the dis module
22:02 The bytecode of the two functions is quite different
You can see that the bytecode of the first function stores the number of seconds in a day in a variable
22:09 That means it has to load a constant
22:12 store it in the variable
22:15 read the value of the variable back
load another constant, and multiply
22:19 and then return the result
The bytecode of the second function only multiplies two constants
22:24 When Python compiles it
22:27 it notices that this is a multiplication of two constants
22:32 whose result is never going to change
22:35 7 * 86400 is always the same value
22:41 Python optimizes this
by doing the multiplication at compile time
22:45 so the function actually returns 604800 directly
All the other superfluous operations are omitted
22:51 This optimization is really clever
22:53 Python does this whenever it encounters constant operations
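The two functions are not reproduced in the transcript. The following sketch uses assumed function and variable names that match the description, and shows that the folded result ends up among the constants of the second function.

```python
import dis


def week_with_variable():
    # Stores the day length in a local first, so the multiply happens at run time.
    seconds_per_day = 86400
    return seconds_per_day * 7


def week_with_constants():
    # Two literals: the compiler can fold the product ahead of time.
    return 86400 * 7


dis.dis(week_with_variable)
dis.dis(week_with_constants)
print(week_with_constants.__code__.co_consts)  # the precomputed product appears here
```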
23:00 This is not the only optimization Python makes
23:03 Have you heard of Spectre and Meltdown?
23:05 Know anything about it?
23:08 These two vulnerabilities are related to branch prediction
23:11 which is when the processor tries to guess which way an if statement will go next
23:18 Python also predicts bytecode operations
23:22 Some bytecode instructions usually come in pairs
For example, a comparison is often followed by a jump instruction
23:29 The Python bytecode interpreter is optimized for this
23:31 and tries to guess the next operation
23:33 so as to make full use of the CPU’s branch prediction and run faster
23:37 So that’s pretty neat
23:41 You can also answer some frequently asked performance questions
People always ask
23:46 why a list or dictionary literal is faster than calling list() or dict()
23:51 Well, here’s why
Creating a dictionary with the literal {}
23:57 takes only two instructions
24:00 Calling dict()
24:02 takes three instructions
24:04 and one of them is CALL_FUNCTION
24:06 which means pushing a call frame onto the call stack
24:07 executing the function, and popping the result back
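The comparison can be reproduced by compiling both expressions. The instruction counts of two and three quoted in the talk are for Python 3.6; newer versions add bookkeeping instructions and rename the call instruction, so this sketch checks only for the presence of a name lookup and a call. The file names passed to compile() are placeholders.

```python
import dis

literal_code = compile("{}", "<literal>", "eval")
call_code = compile("dict()", "<call>", "eval")

literal_opnames = [i.opname for i in dis.get_instructions(literal_code)]
call_opnames = [i.opname for i in dis.get_instructions(call_code)]

print(literal_opnames)  # builds the dictionary directly
print(call_opnames)     # looks up the name dict, then calls it
```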
24:10 Let’s take some real code as an example
24:12 It is a very simple example
24:15 that collects the first ten perfect squares
24:17 The complete bytecode is not shown here
24:20 just the bytecode corresponding to the while loop
24:22 which consists of 15 bytecode instructions
24:25 This code can be optimized
24:28 for example by replacing the while loop with a for loop
24:30 and counting with range
24:34 Now the bytecode of the loop body is much shorter
24:36 It needs only nine instructions
24:39 What if we write it more idiomatically
24:42 for example with a list comprehension?
24:43 What would the bytecode look like?
Now there are only nine instructions in the bytecode of the entire function body
24:48 But don’t be fooled by appearances
24:54 I put this bytecode here for a reason
24:57 Notice that although there are only nine instructions
25:00 they include instructions for creating and calling a function
25:02 so an extra call frame has to be pushed onto the call stack
25:03 the function body executed there
25:05 and then the frame popped and the result returned
25:08 This costs more
25:10 even though there are fewer bytecode instructions
25:15 because not all instructions cost the same
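The three variants are not shown in the transcript. This is a sketch with assumed function names, taking "the first ten perfect squares" to mean 1 through 100. The instruction counts printed will not match the 15 and 9 from the talk on every version, and recent CPython releases compile comprehensions inline rather than as a separate function call.

```python
import dis


def squares_while():
    squares = []
    i = 1
    while i <= 10:
        squares.append(i * i)
        i += 1
    return squares


def squares_for():
    squares = []
    for i in range(1, 11):
        squares.append(i * i)
    return squares


def squares_comprehension():
    return [i * i for i in range(1, 11)]


for function in (squares_while, squares_for, squares_comprehension):
    # Count instructions in the whole function body, not just the loop.
    count = sum(1 for _ in dis.get_instructions(function))
    print(function.__name__, count)
```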
25:18 We are now talking about performance differences between individual bytecode instructions
25:24 Everybody wants to know about this kind of micro-optimization
25:28 First of all, I want to emphasize
Python is slow
25:32 If you’re struggling to speed up the execution of individual Python bytecode instructions
25:35 then you can’t see the forest for the trees
25:37 Python is much slower than C
25:39 so there is little point in agonizing over micro-optimizations
25:43 If you want to write lightning-fast Python code
25:52 go through the Python standard library first
25:56 Look at the built-in functions and built-in classes
25:58 Learn which are implemented in C
26:01 and which are implemented in Python
26:03 because when it comes to speed
26:05 the improvement from optimizing bytecode instructions may be tiny
26:10 whereas switching to a version implemented in C
26:12 brings improvements so large that there is no comparison
26:16 Even so, you might want some basic rules of thumb
26:18 Here are a few
26:22 If you’ve read some Python performance tuning guides
26:24 you’ve probably heard that you should not reference non-local variables inside loops
26:27 Instead, you create a local alias first and then use it in the loop
26:30 This is why (pointing to the slide)
26:32 The LOAD instructions perform differently
26:35 LOAD_CONST and LOAD_FAST are faster
26:38 LOAD_NAME and LOAD_GLOBAL are slower
26:40 And why is that?
Looking up a non-local variable can be complicated
26:47 Python may need to search several namespaces
26:52 If you look at the source code of the interpreter
26:56 you will see that the implementation of these instructions is quite complicated
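The aliasing advice can be sketched as follows. FACTOR and both function names are made up for illustration. The first function loads the global on every iteration, while the second loads it once and then reads a local inside the loop; whether the difference is measurable depends on the workload and the CPython version.

```python
import dis

FACTOR = 3  # hypothetical module-level value


def scale_global(values):
    total = 0
    for value in values:
        total += value * FACTOR  # global lookup on every iteration
    return total


def scale_local(values):
    factor = FACTOR              # one global lookup, before the loop
    total = 0
    for value in values:
        total += value * factor  # local lookup inside the loop
    return total


global_opnames = [i.opname for i in dis.get_instructions(scale_global)]
print(global_opnames)
dis.dis(scale_local)
```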
26:57 Also, loops and code blocks are slow
27:01 so avoid them where you can
They use SETUP_LOOP, SETUP_WITH, and SETUP_EXCEPT
27:10 Every time you enter or exit a loop or code block
27:13 several instructions are needed to enter the loop
27:18 set up the context and push it onto the code block stack
27:19 execute the loop body
27:22 jump out if the loop exits
27:24 and finally pop it off again
27:26 and do some cleanup
27:27 These are fairly expensive instructions, so avoid them where you can
Attribute access, dictionary lookups, and list indexing also deserve attention
27:38 These are LOAD_ATTR and BINARY_SUBSCR
27:42 You hear this a lot
when getting an element from a dictionary or list:
27:45 if I am going to loop
27:47 and reference it on every iteration
it is better to bind it to a local variable beforehand
because every iteration of the loop repeats the lookup; dict lookups are efficient
27:55 but the instruction is still relatively expensive
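A sketch of the same idea for attribute lookups, with made-up function names. The first version looks up math.sqrt and results.append on every iteration; the second binds both to local names once. As the talk stresses, this is a micro-optimization, so measure before relying on it.

```python
import math


def roots_attribute(values):
    results = []
    for value in values:
        # Two attribute lookups per iteration: results.append and math.sqrt.
        results.append(math.sqrt(value))
    return results


def roots_alias(values):
    sqrt = math.sqrt          # looked up once
    results = []
    append = results.append   # looked up once
    for value in values:
        append(sqrt(value))   # only local names inside the loop
    return results


print(roots_attribute([1, 4, 9]))
print(roots_alias([1, 4, 9]))
```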
28:01 You can find more along these lines in the dis module documentation
28:05 It describes the various instructions for your reference
28:08 There is some other material worth reading
28:10 Here are three recommendations
28:13 First is a free online ebook, Inside the Python Virtual Machine
28:20 You are of course welcome to tip the author
28:21 This book is a complete introduction to the inner workings of the Python interpreter
28:28 all the internal mechanisms
the stacks
28:32 the various bytecode instructions
28:36 Next is A Python Interpreter Written in Python by Allison Kaptur
28:39 She explains the implementation in detail
28:40 Oh, she also has a PyCon talk
28:43 She explains how to use
28:48 sensible data structures
28:50 to write a Python interpreter in Python that handles the various bytecode operations
28:52 Finally, read the source code of the CPython bytecode interpreter
28:57 Part of it is the huge switch statement I just showed you
It is about a thousand lines long
29:02 at least in the version I looked at
29:05 certainly several hundred lines
29:08 but it’s not hard to read
29:09 It is very well written C code
29:11 The style of CPython’s C source code is relatively easy to read
29:18 These are good references
29:21 You can find me on Twitter
29:24 I can answer a few questions
29:28 You can follow me online
29:32 And finally, thank you for listening
29:36 I hope you got something out of it
If you find mistakes in this translation or other areas that need improvement, you are welcome to revise it and submit a PR through the Nuggets Translation Project, which also earns reward points. The permanent link at the beginning of this article is the Markdown link to this article on GitHub.