0. References

  • What is a code object in Python?

(Most of this article is adapted and translated from the article above.)

  • Inspect — Inspect live objects
  • PyCodeObject and Python program execution

1. The concept

A code object is an internal representation of executable Python code in CPython. Executable Python code includes:

  • Functions
  • Modules
  • Classes
  • Generator expressions

When you run a piece of code, it is parsed and compiled into a code object, which is then executed by the CPython virtual machine. The code object contains a series of instructions that directly manipulate the internal state of the virtual machine. This is similar to programming in C, where you write human-readable text and the compiler converts it into binary form. The binary code (machine code for C, bytecode for Python) is then executed directly by the CPU (for C) or by the virtual CPU of the CPython virtual machine (for Python).

In addition to containing instructions, the code object provides some additional information that the virtual machine needs to run the code.
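
As a minimal sketch of the pipeline described above, the built-in compile() turns source text into a code object, and exec() hands that object to the virtual machine. The source string and the "<example>" file name are made up for illustration.

```python
import types

source = "result = 6 * 7"

# compile() parses and compiles the source text into a code object.
code_obj = compile(source, "<example>", "exec")
print(isinstance(code_obj, types.CodeType))  # True

# exec() runs the code object; the names it binds end up in the namespace dict.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])  # 42
```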


2. Explore

The following experiments were run on Python 3.7 and focus on functions. Modules and classes are also implemented through code objects (in fact, serialized module code objects are what .pyc files store), but most features of code objects relate primarily to functions.
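
The remark about .pyc files can be illustrated with the standard marshal module, which can serialize code objects to bytes. This is only a sketch of the round trip: a real .pyc file also begins with a header (magic number and source metadata) that is omitted here, and the source string is made up for illustration.

```python
import marshal

code_obj = compile("answer = 40 + 2", "<example>", "exec")

# Serialize the code object to bytes, much like the body of a .pyc file.
payload = marshal.dumps(code_obj)

# Load it back and execute the restored code object.
restored = marshal.loads(payload)
namespace = {}
exec(restored, namespace)
print(namespace["answer"])  # 42
```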

Two points to note about the version:

  • In Python 2, a function's code object is accessed through function.func_code; in Python 3, it is accessed through function.__code__.
  • Python 3 adds a new attribute to code objects, co_kwonlyargcount, which corresponds to keyword-only arguments.

First, in the console, list the 15 attributes of a function's __code__ whose names do not begin with a double underscore.

>>> li = [i for i in dir((lambda: 0).__code__) if not i.startswith('__')]
>>> print(li)
['co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_kwonlyargcount', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
>>> len(li)
15

The following is from the official document:

Attribute           Description
co_argcount         number of arguments (not including keyword-only arguments, * or ** args)
co_code             string of raw compiled bytecode
co_cellvars         tuple of names of cell variables (referenced by containing scopes)
co_consts           tuple of constants used in the bytecode
co_filename         name of file in which this code object was created
co_firstlineno      number of first line in Python source code
co_flags            bitmap of CO_* flags
co_lnotab           encoded mapping of line numbers to bytecode indices
co_freevars         tuple of names of free variables (referenced via a function's closure)
co_kwonlyargcount   number of keyword-only arguments (not including ** arg)
co_name             name with which this code object was defined
co_names            tuple of names of local variables
co_nlocals          number of local variables
co_stacksize        virtual machine stack space required
co_varnames         tuple of names of arguments and local variables

Here are the explanations:

  • co_argcount: The number of arguments the function accepts, excluding *args, **kwargs, and keyword-only arguments.
>>> def test(a, b, c, d=1, e=2, *args, f=3, g, h=4, **kwargs):
...     print(a, b, c, d, e, f, g, h, args, kwargs)
...
>>> code_obj = test.__code__
>>> code_obj.co_argcount
5

  • co_code: The bytecode, stored as a byte string of type bytes (in Python 2 it is stored as str). It provides the sequence of instructions for the virtual machine. Execution of the function starts at the first instruction and stops when a RETURN_VALUE instruction is reached.

For the other bytecode instructions, refer to the official documentation, Python Bytecode Instructions.

Each bytecode instruction has an opcode, which specifies the operation the virtual machine needs to perform, and an optional argument, which is an integer. An opcode is a single-byte integer, so there can be at most 256 different opcodes, although many of them are unused. Each opcode has a name, which appears in the output of the dis function in the dis module; the names are defined in the opcode library module.

>>> from opcode import opname
>>> opname
['<0>', 'POP_TOP', 'ROT_TWO', 'ROT_THREE', 'DUP_TOP', 'DUP_TOP_TWO', '<6>', '<7>', '<8>', 'NOP', 'UNARY_POSITIVE', 'UNARY_NEGATIVE', 'UNARY_NOT', '<13>', '<14>', 'UNARY_INVERT', 'BINARY_MATRIX_MULTIPLY', 'INPLACE_MATRIX_MULTIPLY', '<18>', 'BINARY_POWER', 'BINARY_MULTIPLY', '<21>', 'BINARY_MODULO', 'BINARY_ADD', 'BINARY_SUBTRACT', 'BINARY_SUBSCR', 'BINARY_FLOOR_DIVIDE', 'BINARY_TRUE_DIVIDE', 'INPLACE_FLOOR_DIVIDE', 'INPLACE_TRUE_DIVIDE', '<30>', '<31>', '<32>', '<33>', '<34>', '<35>', '<36>', '<37>', '<38>', '<39>', '<40>', '<41>', '<42>', '<43>', '<44>', '<45>', '<46>', '<47>', '<48>', '<49>', 'GET_AITER', 'GET_ANEXT', 'BEFORE_ASYNC_WITH', '<53>', '<54>', 'INPLACE_ADD', 'INPLACE_SUBTRACT', 'INPLACE_MULTIPLY', '<58>', 'INPLACE_MODULO', 'STORE_SUBSCR', 'DELETE_SUBSCR', 'BINARY_LSHIFT', 'BINARY_RSHIFT', 'BINARY_AND', 'BINARY_XOR', 'BINARY_OR', 'INPLACE_POWER', 'GET_ITER', 'GET_YIELD_FROM_ITER', 'PRINT_EXPR', 'LOAD_BUILD_CLASS', 'YIELD_FROM', 'GET_AWAITABLE', '<74>', 'INPLACE_LSHIFT', 'INPLACE_RSHIFT', 'INPLACE_AND', 'INPLACE_XOR', 'INPLACE_OR', 'BREAK_LOOP', 'WITH_CLEANUP_START', 'WITH_CLEANUP_FINISH', 'RETURN_VALUE', 'IMPORT_STAR', 'SETUP_ANNOTATIONS', 'YIELD_VALUE', 'POP_BLOCK', 'END_FINALLY', 'POP_EXCEPT', 'STORE_NAME', 'DELETE_NAME', 'UNPACK_SEQUENCE', 'FOR_ITER', 'UNPACK_EX', 'STORE_ATTR', 'DELETE_ATTR', 'STORE_GLOBAL', 'DELETE_GLOBAL', '<99>', 'LOAD_CONST', 'LOAD_NAME', 'BUILD_TUPLE', 'BUILD_LIST', 'BUILD_SET', 'BUILD_MAP', 'LOAD_ATTR', 'COMPARE_OP', 'IMPORT_NAME', 'IMPORT_FROM', 'JUMP_FORWARD', 'JUMP_IF_FALSE_OR_POP', 'JUMP_IF_TRUE_OR_POP', 'JUMP_ABSOLUTE', 'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_TRUE', 'LOAD_GLOBAL', '<117>', '<118>', 'CONTINUE_LOOP', 'SETUP_LOOP', 'SETUP_EXCEPT', 'SETUP_FINALLY', '<123>', 'LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', '<127>', '<128>', '<129>', 'RAISE_VARARGS', 'CALL_FUNCTION', 'MAKE_FUNCTION', 'BUILD_SLICE', '<134>', 'LOAD_CLOSURE', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', '<139>', '<140>', 'CALL_FUNCTION_KW', 'CALL_FUNCTION_EX', 'SETUP_WITH', 'EXTENDED_ARG', 'LIST_APPEND', 'SET_ADD', 'MAP_ADD', 'LOAD_CLASSDEREF', 'BUILD_LIST_UNPACK', 'BUILD_MAP_UNPACK', 'BUILD_MAP_UNPACK_WITH_CALL', 'BUILD_TUPLE_UNPACK', 'BUILD_SET_UNPACK', 'SETUP_ASYNC_WITH', 'FORMAT_VALUE', 'BUILD_CONST_KEY_MAP', 'BUILD_STRING', 'BUILD_TUPLE_UNPACK_WITH_CALL', '<159>', 'LOAD_METHOD', 'CALL_METHOD', '<162>', '<163>', '<164>', '<165>', '<166>', '<167>', '<168>', '<169>', '<170>', '<171>', '<172>', '<173>', '<174>', '<175>', '<176>', '<177>', '<178>', '<179>', '<180>', '<181>', '<182>', '<183>', '<184>', '<185>', '<186>', '<187>', '<188>', '<189>', '<190>', '<191>', '<192>', '<193>', '<194>', '<195>', '<196>', '<197>', '<198>', '<199>', '<200>', '<201>', '<202>', '<203>', '<204>', '<205>', '<206>', '<207>', '<208>', '<209>', '<210>', '<211>', '<212>', '<213>', '<214>', '<215>', '<216>', '<217>', '<218>', '<219>', '<220>', '<221>', '<222>', '<223>', '<224>', '<225>', '<226>', '<227>', '<228>', '<229>', '<230>', '<231>', '<232>', '<233>', '<234>', '<235>', '<236>', '<237>', '<238>', '<239>', '<240>', '<241>', '<242>', '<243>', '<244>', '<245>', '<246>', '<247>', '<248>', '<249>', '<250>', '<251>', '<252>', '<253>', '<254>', '<255>']

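In CPython 3.6 and later, including the Python 3.7 used here, every instruction takes two bytes: one for the opcode and one for its argument, which is ignored by opcodes that take no argument. If an argument does not fit in one byte, that is, if it is greater than 255, the special opcode EXTENDED_ARG is used to supply the higher-order bytes. (In CPython 3.5 and earlier, an opcode without an argument took one byte and an opcode with an argument took three, with the second and third bytes storing the argument in little-endian order and EXTENDED_ARG needed only for values greater than 65535.)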
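
The encoding can be inspected by walking co_code by hand. The sketch below assumes CPython 3.6 or later, where each instruction is an (opcode, argument) byte pair; the function add is a made-up example, and dis.dis gives a more complete listing in practice. On recent CPython versions the output may also contain CACHE placeholder entries.

```python
from opcode import opname

def add(a, b):
    return a + b

raw = add.__code__.co_code

# Walk the byte string two bytes at a time: the opcode first, then its argument.
decoded = [(opname[raw[i]], raw[i + 1]) for i in range(0, len(raw), 2)]

for name, arg in decoded:
    print(name, arg)
```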

  • co_cellvars, co_freevars: These two attributes are used to implement scoping for nested functions.

The co_cellvars tuple stores the names of all local variables that are used by nested functions. The co_freevars tuple stores the names of all variables defined in an enclosing scope that are used by the function. The variable names in these tuples are arranged alphabetically. In the following example, a and c are the cellvars of f and the freevars of g.

def f(a, b):
    c = 3
    def g():
        return a + c
    return g

print(f.__code__.co_cellvars)
# In Python 3.7, co_consts[2] of f is the code object of the nested function g
print(f.__code__.co_consts[2].co_freevars)
"""
('a', 'c')
('a', 'c')
"""
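
At run time, the values behind co_freevars live in cell objects stored on the inner function's __closure__ attribute, in the same order as the names. A small sketch, with make_adder as a hypothetical example function:

```python
def make_adder(base):
    offset = 10
    def add(x):
        return x + base + offset
    return add

adder = make_adder(5)

# Pair each free variable name with the content of its closure cell.
free_values = dict(zip(adder.__code__.co_freevars,
                       (cell.cell_contents for cell in adder.__closure__)))
print(free_values)
```
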
  • co_consts: All constants used in the function, such as integers, strings, and Booleans. It is used by the LOAD_CONST opcode, which takes an index as its argument indicating which element of the co_consts tuple to load.

The first element of the co_consts tuple is the function's docstring, or None if the function has no docstring.
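
The relationship between LOAD_CONST and co_consts can be checked with dis.get_instructions. This is a sketch on a made-up function greet; the exact contents and order of co_consts beyond the docstring may differ between CPython versions.

```python
import dis

def greet():
    """Say hi."""
    message = "hi"
    return message

code = greet.__code__
print(code.co_consts)

# For every LOAD_CONST instruction, the argument is an index into co_consts.
loads = [ins for ins in dis.get_instructions(greet) if ins.opname == "LOAD_CONST"]
for ins in loads:
    print(ins.arg, repr(code.co_consts[ins.arg]))
```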

  • co_filename: The name of the file in which the code object was created.

# test.py

f = lambda: 0
print(f.__code__.co_filename)
"""
test.py
"""

  • co_firstlineno: The line number, within its source file, of the first line of the code object.
# comment

f = lambda: 0
print(f.__code__.co_firstlineno)
"""
3
"""

  • co_flags: An integer whose bits hold the combined Boolean flags of the function.

The meaning of these flag bits is described in the inspect module documentation: Code Objects Bit Flags
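
As a sketch of how the flag bits are read, the inspect module exposes the CO_* constants, which can be tested against co_flags with a bitwise AND. The helper flag_names and the three sample functions are made up for illustration, and only a handful of flags are checked.

```python
import inspect

def plain(a, b):
    return a + b

def variadic(*args, **kwargs):
    return args, kwargs

def counter():
    yield 1

def flag_names(func):
    """Return the names of a few well-known flags set on func's code object."""
    flags = func.__code__.co_flags
    candidates = ["CO_VARARGS", "CO_VARKEYWORDS", "CO_GENERATOR", "CO_COROUTINE"]
    return [name for name in candidates if flags & getattr(inspect, name)]

print(flag_names(plain))     # []
print(flag_names(variadic))  # ['CO_VARARGS', 'CO_VARKEYWORDS']
print(flag_names(counter))   # ['CO_GENERATOR']
```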

  • co_lnotab: The name is short for "line number table". It is stored as bytes; each pair of bytes holds, respectively, an increment to the offset in the co_code byte string and the corresponding increment to the Python line number.

See lnotab_notes.txt for details
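
Decoding the table by hand is rarely necessary: dis.findlinestarts reads a code object's line number information and yields (bytecode offset, line number) pairs. The sketch below applies it to a made-up function sample; it is a convenience for inspection, not a description of the byte-level format.

```python
import dis

def sample(x):
    y = x + 1
    z = y * 2
    return z

code = sample.__code__

# Each pair marks the bytecode offset at which a new source line begins.
line_starts = [(offset, lineno)
               for offset, lineno in dis.findlinestarts(code)
               if lineno is not None]

for offset, lineno in line_starts:
    # Print line numbers relative to the line of the def statement.
    print(offset, lineno - code.co_firstlineno)
```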

  • co_kwonlyargcount: The number of keyword-only arguments. This attribute does not exist in Python 2.
>>> def test(a, b, c, d=1, e=2, *args, f=3, g, h=4, **kwargs):
...     print(a, b, c, d, e, f, g, h, args, kwargs)
...
>>> code_obj = test.__code__
>>> code_obj.co_kwonlyargcount
3

  • co_name: The name of the object associated with the code object.
>>> func = lambda: 0
>>> func.__code__.co_name
'<lambda>'
>>> def test(): pass
...
>>> test.__code__.co_name
'test'

  • co_names: A tuple of strings containing the names of global variables and imported names, in the order in which they are used. (Note that the table in the official documentation describes it as the names of local variables, which is not accurate.)
a = 1

def f(x):
    x = a
    print('hello')

print(f.__code__.co_names)
"""
('a', 'print')
"""

  • co_nlocals: The number of local variables in the function, which equals the length of co_varnames.
  • co_stacksize: An integer representing the maximum stack space that the function will use.
  • co_varnames: A tuple of all the local variable names of the function, including its parameters.

First come the positional arguments (including those with default values) and the keyword-only arguments, then *args and **kwargs (if any), and finally the other local variables in the order they are first used.

>>> def test(a, b, c, d=1, e=2, *args, f=3, g, h=4, **kwargs):
...     print(a, b, c, d, e, f, g, h, args, kwargs)
...     x = 'Awesome!'
...
>>> code_obj = test.__code__
>>> code_obj.co_varnames
('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'args', 'kwargs', 'x')

Completed in 2019.02.04