0. References
- What is a code object in Python?
(Most of this post is adapted and translated from the article above.)
- Inspect — Inspect live objects
- PyCodeObject and Python program execution
1. The concept
A code object is CPython's internal representation of a piece of executable Python code. Executable Python code includes:
- Functions
- Modules
- Classes
- Generator expressions
When you run a piece of code, it is parsed and compiled into a code object, which is then executed by the CPython virtual machine. The code object contains a sequence of instructions that directly manipulate the internal state of the virtual machine. This is similar to programming in C, where you write human-readable text and the compiler converts it into binary form. The binary code (machine code for C, bytecode for Python) is then executed directly by the CPU (for C) or by the virtual CPU of the CPython virtual machine (for Python).
In addition to containing instructions, the code object provides some additional information that the virtual machine needs to run the code.
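The compile-then-execute flow described above can be reproduced by hand with the built-in compile and exec functions. This is a minimal sketch; the source string and the "<demo>" file name are made up for illustration.

```python
# Compile a piece of source text into a code object, then execute it.
source = "answer = 6 * 7"

# compile(source, filename, mode) returns a code object.
# "<demo>" is a hypothetical file name used only for this example.
code_obj = compile(source, "<demo>", "exec")

namespace = {}
exec(code_obj, namespace)  # the virtual machine runs the bytecode

print(type(code_obj).__name__)  # code
print(code_obj.co_filename)     # <demo>
print(namespace["answer"])      # 42
```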
2. Explore
The following experiments were run on Python 3.7 and focus on functions. Modules and classes are also implemented through code objects (in fact, serialized module code objects are what is stored in .pyc files), but most of the features of code objects are primarily related to functions.
Two points to note about the version:
- In Python 2, a function's code object is accessed through function.func_code, while in Python 3 it is accessed through function.__code__.
- Python 3 adds a new attribute to the code object, co_kwonlyargcount, which corresponds to keyword-only arguments.
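The remark above that serialized module code objects end up in .pyc files can be illustrated with the standard marshal module, which is able to serialize code objects. This is only a sketch: the module source and the file name demo_module.py are hypothetical, and a real .pyc file additionally starts with a small header before the marshalled data.

```python
import marshal

# Hypothetical module source, used only for this example.
module_source = "def greet():\n    return 'hello'\n"
module_code = compile(module_source, "demo_module.py", "exec")

# Serialize the module's code object to bytes and load it back.
data = marshal.dumps(module_code)
restored = marshal.loads(data)

# Executing the restored code object defines greet() in the namespace.
namespace = {}
exec(restored, namespace)
print(namespace["greet"]())  # hello
```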
First, in the console, list the 15 attributes of a function's __code__ object whose names do not begin with a double underscore.
>>> li = [i for i in dir((lambda: 0).__code__) if not i.startswith('__')]
>>> print(li)
['co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_kwonlyargcount', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
>>> len(li)
15
The following descriptions are taken from the official documentation:
| Attribute | Description |
|---|---|
| co_argcount | number of arguments (not including keyword only arguments, * or ** args) |
| co_code | string of raw compiled bytecode |
| co_cellvars | tuple of names of cell variables (referenced by containing scopes) |
| co_consts | tuple of constants used in the bytecode |
| co_filename | name of file in which this code object was created |
| co_firstlineno | number of first line in Python source code |
| co_flags | bitmap of CO_* flags, read more here |
| co_lnotab | encoded mapping of line numbers to bytecode indices |
| co_freevars | tuple of names of free variables (referenced via a function's closure) |
| co_kwonlyargcount | number of keyword only arguments (not including ** arg) |
| co_name | name with which this code object was defined |
| co_names | tuple of names of local variables |
| co_nlocals | number of local variables |
| co_stacksize | virtual machine stack space required |
| co_varnames | tuple of names of arguments and local variables |
Here are the explanations:
co_argcount: The number of arguments the function accepts, not counting *args, **kwargs, or keyword-only arguments.
>>> def test(a, b, c, d=1, e=2, *args, f=3, g, h=4, **kwargs):
...     print(a, b, c, d, e, f, g, h, args, kwargs)
...
>>> code_obj = test.__code__
>>> code_obj.co_argcount
5
co_code: The bytecode in binary form, stored as a byte string of type bytes (in Python 2 it is stored as str). It provides the sequence of instructions for the virtual machine. The function starts executing at the first instruction and stops when it reaches a RETURN_VALUE instruction.
For the other bytecode instructions, see the official documentation, Python Bytecode Instructions.
Depending on the Python version, instructions in the bytecode may differ in length. Each instruction has an opcode, which specifies the operation the virtual machine should perform, and an optional integer argument. The opcode is a single-byte integer, so there can be at most 256 different opcodes, although many of them are unused. Each opcode has a name, which appears in the output of the dis function in the dis module; the names are defined in the opcode library module.
>>> from opcode import opname
>>> opname
['<0>', 'POP_TOP', 'ROT_TWO', 'ROT_THREE', 'DUP_TOP', 'DUP_TOP_TWO', '<6>', '<7>', '<8>', 'NOP', 'UNARY_POSITIVE', 'UNARY_NEGATIVE', 'UNARY_NOT', '<13>', '<14>', 'UNARY_INVERT', 'BINARY_MATRIX_MULTIPLY', 'INPLACE_MATRIX_MULTIPLY', '<18>', 'BINARY_POWER', 'BINARY_MULTIPLY', '<21>', 'BINARY_MODULO', 'BINARY_ADD', 'BINARY_SUBTRACT', 'BINARY_SUBSCR', 'BINARY_FLOOR_DIVIDE', 'BINARY_TRUE_DIVIDE', 'INPLACE_FLOOR_DIVIDE', 'INPLACE_TRUE_DIVIDE', '<30>', '<31>', '<32>', '<33>', '<34>', '<35>', '<36>', '<37>', '<38>', '<39>', '<40>', '<41>', '<42>', '<43>', '<44>', '<45>', '<46>', '<47>', '<48>', '<49>', 'GET_AITER', 'GET_ANEXT', 'BEFORE_ASYNC_WITH', '<53>', '<54>', 'INPLACE_ADD', 'INPLACE_SUBTRACT', 'INPLACE_MULTIPLY', '<58>', 'INPLACE_MODULO', 'STORE_SUBSCR', 'DELETE_SUBSCR', 'BINARY_LSHIFT', 'BINARY_RSHIFT', 'BINARY_AND', 'BINARY_XOR', 'BINARY_OR', 'INPLACE_POWER', 'GET_ITER', 'GET_YIELD_FROM_ITER', 'PRINT_EXPR', 'LOAD_BUILD_CLASS', 'YIELD_FROM', 'GET_AWAITABLE', '<74>', 'INPLACE_LSHIFT', 'INPLACE_RSHIFT', 'INPLACE_AND', 'INPLACE_XOR', 'INPLACE_OR', 'BREAK_LOOP', 'WITH_CLEANUP_START', 'WITH_CLEANUP_FINISH', 'RETURN_VALUE', 'IMPORT_STAR', 'SETUP_ANNOTATIONS', 'YIELD_VALUE', 'POP_BLOCK', 'END_FINALLY', 'POP_EXCEPT', 'STORE_NAME', 'DELETE_NAME', 'UNPACK_SEQUENCE', 'FOR_ITER', 'UNPACK_EX', 'STORE_ATTR', 'DELETE_ATTR', 'STORE_GLOBAL', 'DELETE_GLOBAL', '<99>', 'LOAD_CONST', 'LOAD_NAME', 'BUILD_TUPLE', 'BUILD_LIST', 'BUILD_SET', 'BUILD_MAP', 'LOAD_ATTR', 'COMPARE_OP', 'IMPORT_NAME', 'IMPORT_FROM', 'JUMP_FORWARD', 'JUMP_IF_FALSE_OR_POP', 'JUMP_IF_TRUE_OR_POP', 'JUMP_ABSOLUTE', 'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_TRUE', 'LOAD_GLOBAL', '<117>', '<118>', 'CONTINUE_LOOP', 'SETUP_LOOP', 'SETUP_EXCEPT', 'SETUP_FINALLY', '<123>', 'LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', '<127>', '<128>', '<129>', 'RAISE_VARARGS', 'CALL_FUNCTION', 'MAKE_FUNCTION', 'BUILD_SLICE', '<134>', 'LOAD_CLOSURE', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', '<139>', '<140>', 'CALL_FUNCTION_KW', 'CALL_FUNCTION_EX', 'SETUP_WITH', 'EXTENDED_ARG', 'LIST_APPEND', 'SET_ADD', 'MAP_ADD', 'LOAD_CLASSDEREF', 'BUILD_LIST_UNPACK', 'BUILD_MAP_UNPACK', 'BUILD_MAP_UNPACK_WITH_CALL', 'BUILD_TUPLE_UNPACK', 'BUILD_SET_UNPACK', 'SETUP_ASYNC_WITH', 'FORMAT_VALUE', 'BUILD_CONST_KEY_MAP', 'BUILD_STRING', 'BUILD_TUPLE_UNPACK_WITH_CALL', '<159>', 'LOAD_METHOD', 'CALL_METHOD', '<162>', '<163>', '<164>', '<165>', '<166>', '<167>', '<168>', '<169>', '<170>', '<171>', '<172>', '<173>', '<174>', '<175>', '<176>', '<177>', '<178>', '<179>', '<180>', '<181>', '<182>', '<183>', '<184>', '<185>', '<186>', '<187>', '<188>', '<189>', '<190>', '<191>', '<192>', '<193>', '<194>', '<195>', '<196>', '<197>', '<198>', '<199>', '<200>', '<201>', '<202>', '<203>', '<204>', '<205>', '<206>', '<207>', '<208>', '<209>', '<210>', '<211>', '<212>', '<213>', '<214>', '<215>', '<216>', '<217>', '<218>', '<219>', '<220>', '<221>', '<222>', '<223>', '<224>', '<225>', '<226>', '<227>', '<228>', '<229>', '<230>', '<231>', '<232>', '<233>', '<234>', '<235>', '<236>', '<237>', '<238>', '<239>', '<240>', '<241>', '<242>', '<243>', '<244>', '<245>', '<246>', '<247>', '<248>', '<249>', '<250>', '<251>', '<252>', '<253>', '<254>', '<255>']
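In Python 3.5 and earlier, an opcode that takes no argument occupies one byte, while an opcode that takes an argument occupies three bytes, with the second and third bytes storing the argument in little-endian order. If an argument cannot be represented in two bytes, that is, it is greater than 65535, the special opcode EXTENDED_ARG is used. Since Python 3.6 (including the 3.7 used here), every instruction occupies two bytes, one for the opcode and one for the argument, and EXTENDED_ARG is used when an argument does not fit in a single byte.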
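To make the encoding concrete, co_code can be split into instructions by hand. This is a minimal sketch that assumes the two-byte layout of Python 3.6 and later; the function add is made up for the example, and the exact instructions printed vary between Python versions.

```python
from opcode import opname

def add(a, b):  # hypothetical function used only for this example
    return a + b

raw = add.__code__.co_code  # the raw bytecode as a bytes object

# Assumes Python 3.6+: each instruction is (opcode byte, argument byte).
decoded = [(opname[raw[i]], raw[i + 1]) for i in range(0, len(raw), 2)]

for name, arg in decoded:
    print(name, arg)
```

The dis module performs the same decoding more robustly (it also handles EXTENDED_ARG), so a hand-written loop like this is mainly useful for seeing the layout.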
co_cellvars and co_freevars: These two attributes are used to implement the scoping of nested functions.
The co_cellvars tuple stores the names of all variables that are used by nested functions. The co_freevars tuple stores the names of all variables used by the function that are defined in an enclosing (closure) scope. The variable names in these tuples are arranged alphabetically. As the following example shows, a and c are the cellvars of f and the freevars of g.
def f(a, b):
    c = 3
    def g():
        return a + c
    return g
print(f.__code__.co_cellvars)
print(f.__code__.co_consts[2].co_freevars)
"""
('a', 'c')
('a', 'c')
"""
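At run time, the free variables listed in co_freevars are backed by the cells in the function's __closure__ attribute, in the same order. The following sketch pairs the two up, reusing f and g from the example above with made-up argument values.

```python
def f(a, b):
    c = 3
    def g():
        return a + c
    return g

g = f(1, 2)  # the argument values are arbitrary

# Pair each free variable name with the value held in its closure cell.
captured = {
    name: cell.cell_contents
    for name, cell in zip(g.__code__.co_freevars, g.__closure__)
}
print(captured)  # maps 'a' to 1 and 'c' to 3
```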
co_consts: All the constants used in the function, such as integers, strings, and Booleans. They are used by the LOAD_CONST opcode, which takes an index as its argument indicating which element of the co_consts tuple to load.
In Python 3.7, the first element of the co_consts tuple is the function's docstring, or None if the function has no docstring.
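The link between LOAD_CONST and co_consts can be checked with dis.get_instructions, where each instruction's arg is the index and argval is the constant it resolves to. This is a small sketch; the function sample and its constants are made up for the example.

```python
import dis

def sample():  # hypothetical function used only for this example
    number = 12345
    text = "hello"
    return number, text

consts = sample.__code__.co_consts

# Collect (index, constant) for every LOAD_CONST instruction.
loaded = [
    (ins.arg, ins.argval)
    for ins in dis.get_instructions(sample)
    if ins.opname == "LOAD_CONST"
]

for index, value in loaded:
    # The index points at the matching element of co_consts.
    print(index, repr(value), consts[index] == value)
```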
co_filename: The name of the file in which the code object was created.
# test.py
f = lambda: 0
print(f.__code__.co_filename)
"""
test.py
"""
co_firstlineno: The line number, within its source file, of the first line of the code object.
# comment

f = lambda: 0  # the lambda sits on line 3 of the file
print(f.__code__.co_firstlineno)
"""
3
"""
co_flags: An integer that holds the function's Boolean flags, combined as a bit field.
The meaning of these flag bits is described in the inspect module documentation: Code Objects Bit Flags
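Individual flags can be tested by masking co_flags with the CO_* constants exposed by the inspect module. A short sketch, with three made-up functions and a hypothetical helper has_flag:

```python
import inspect

def plain(x):
    return x

def variadic(*args, **kwargs):
    return args, kwargs

def generator():
    yield 1

def has_flag(func, flag):
    """Return True if the given CO_* bit is set on the function's code object."""
    return bool(func.__code__.co_flags & flag)

print(has_flag(variadic, inspect.CO_VARARGS))      # True
print(has_flag(variadic, inspect.CO_VARKEYWORDS))  # True
print(has_flag(generator, inspect.CO_GENERATOR))   # True
print(has_flag(plain, inspect.CO_GENERATOR))       # False
```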
co_lnotab: The name is short for "line number table". It is stored as bytes, in which every two bytes form a pair: the first is the increment of the offset into the co_code byte string, and the second is the increment of the Python line number.
See lnotab_notes.txt for details.
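Rather than decoding the raw table by hand, the decoded mapping can be inspected with dis.findlinestarts from the standard library, which yields (bytecode offset, line number) pairs. This is a sketch with a made-up three-line source string; the exact offsets differ between Python versions, and newer versions may emit extra entries.

```python
import dis

# Hypothetical three-line module, used only for this example.
source = "x = 1\ny = 2\nz = x + y\n"
code_obj = compile(source, "<demo>", "exec")

# Each pair says: the instruction at this offset starts this source line.
line_starts = list(dis.findlinestarts(code_obj))
for offset, lineno in line_starts:
    print(offset, lineno)
```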
co_kwonlyargcount: The number of keyword-only arguments. This attribute does not exist in Python 2.
>>> def test(a, b, c, d=1, e=2, *args, f=3, g, h=4, **kwargs):
...     print(a, b, c, d, e, f, g, h, args, kwargs)
...
>>> code_obj = test.__code__
>>> code_obj.co_kwonlyargcount
3
co_name: The name of the object associated with the code object.
>>> func = lambda: 0
>>> func.__code__.co_name
'<lambda>'
>>> def test(): pass
...
>>> test.__code__.co_name
'test'
co_names: A tuple of strings containing the names of global variables and imported names, in the order in which they are used. (Note that the table in the official documentation describes it as the names of local variables, which is not accurate.)
a = 1
def f(x):
    x = a
    print('hello')
print(f.__code__.co_names)
"""
('a', 'print')
"""
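Imported module names and attribute names also land in co_names, while the local variable that an import binds inside a function goes to co_varnames. A small sketch, using a made-up function uses_names:

```python
def uses_names():  # hypothetical function used only for this example
    import os          # 'os' is looked up by name for the import
    return os.path     # 'path' is an attribute name

code = uses_names.__code__
print(code.co_names)     # contains 'os' and 'path'
print(code.co_varnames)  # contains 'os', the local the import is bound to
```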
co_nlocals: The number of local variables in the function, which equals the length of co_varnames.
co_stacksize: An integer representing the maximum stack space the function will use.
co_varnames: A tuple of the names of all the function's local variables, including its parameters.
First come the positional parameters (including those with default values), then the keyword-only parameters, then *args and **kwargs (if present), and finally the other local variables in the order in which they are first used.
>>> def test(a, b, c, d=1, e=2, *args, f=3, g, h=4, **kwargs):
...     print(a, b, c, d, e, f, g, h, args, kwargs)
...     x = 'Awesome!'
...
>>> code_obj = test.__code__
>>> code_obj.co_varnames
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'args', 'kwargs', 'x')
Completed in 2019.02.04