This article is intended for readers who already have a foundation in Python.

Author: pwwang

1. Foreword

This article is based on an open source project:

github.com/pwwang/pyth…

The project is designed to help readers understand the import mechanism of Python.

1.1 What is the import mechanism?

In general, whenever a piece of Python code references code in another module, Python's import mechanism has to run. The import statement is the most common way to trigger the import mechanism, but it is not the only way.

importlib.import_module and the built-in __import__ function can also be used to import code from other modules.
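As a small illustration of these alternatives, the sketch below imports the same module in three ways. The module name json.decoder is only an example chosen here. One difference worth knowing: for a dotted name, __import__ returns the top-level package, while importlib.import_module returns the submodule itself.

```python
import importlib
import sys

# 1. The import statement binds the name "json" in the current namespace.
import json.decoder

# 2. importlib.import_module returns the module named by the full dotted path.
decoder_mod = importlib.import_module("json.decoder")

# 3. __import__ with a dotted name returns the top-level package instead.
top_level = __import__("json.decoder")

print(decoder_mod.__name__)                         # json.decoder
print(top_level.__name__)                           # json
print(decoder_mod is sys.modules["json.decoder"])   # True
```

In everyday code, importlib.import_module is usually the more convenient of the two functions when the module name is only known at runtime.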

1.2 How is import implemented?

The import statement performs two steps:

  1. Search for the module to import
  2. Bind the result of the search to a name in the local scope

The search step is actually done by the __import__ function, and its return value is what gets bound to the local name. We'll talk more about how the __import__ function works below.
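The two steps can be written out by hand. This is only a sketch of the idea, not what the interpreter literally executes, and the variable names found and math_from_statement are made up for the illustration.

```python
import sys

# Step 1: search for the module; __import__ returns the module object.
found = __import__("math")

# Step 2: bind the result to a name in the current scope.
math = found

# The statement form performs both steps at once and yields the same object.
import math as math_from_statement

print(math is math_from_statement)    # True
print(math is sys.modules["math"])    # True
```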

2. Overview of import mechanism

The following is an overview of the import mechanism. As you can see, when the import mechanism is triggered, Python first checks sys.modules to see whether the module has already been imported. If it has, the cached module is used directly; otherwise Python proceeds to the next step. sys.modules can be thought of as a cache. Note that a ModuleNotFoundError is raised if the module's value in sys.modules is None. Here's a simple experiment:

In [1]: import sys

In [2]: sys.modules['os'] = None

In [3]: import os
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-543d7f3a58ae> in <module>
----> 1 import os

ModuleNotFoundError: import of os halted; None in sys.modules

If the module is found in sys.modules and the import was triggered by an import statement, the next step is to bind the module to a name in the local scope.
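The cache behavior can be observed without touching a real module. In the sketch below, cached_demo and blocked_demo are made-up names used only for this illustration; no files with those names are assumed to exist.

```python
import sys
import types

# A hand-made module object placed directly into the cache.
fake = types.ModuleType("cached_demo")
fake.answer = 42
sys.modules["cached_demo"] = fake

# No finder is consulted: the import is served straight from sys.modules.
import cached_demo
print(cached_demo is fake)   # True

# A None entry blocks the import, as in the experiment above.
sys.modules["blocked_demo"] = None
try:
    import blocked_demo
except ModuleNotFoundError as exc:
    blocked_error = exc
    print(type(exc).__name__)   # ModuleNotFoundError

# Remove the entries that were added for the demonstration.
del sys.modules["cached_demo"], sys.modules["blocked_demo"]
```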

If the module is not in the cache, Python runs the full import process. It traverses sys.meta_path looking for a meta path finder that can handle the module. sys.meta_path is a list of meta path finders, and by default it contains three:

  • Built-in module finder
  • Frozen module finder
  • Path-based module finder

In [1]: import sys

In [2]: sys.meta_path
Out[2]: 
[_frozen_importlib.BuiltinImporter,
 _frozen_importlib.FrozenImporter,
 _frozen_importlib_external.PathFinder]

The finder's find_spec method determines whether the finder can handle the module being imported. If it can, it returns a ModuleSpec object containing the information needed to load the module. If it returns None, Python moves on to the next meta path finder in sys.meta_path. If sys.meta_path is exhausted without any finder returning a spec, ModuleNotFoundError is raised. This is what happens when you import a nonexistent module, because none of the finders in sys.meta_path can handle it:

In [1]: import nosuchmodule
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-40c387f4d718> in <module>
----> 1 import nosuchmodule

ModuleNotFoundError: No module named 'nosuchmodule'
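The traversal of sys.meta_path can be imitated in a few lines. This is a simplified sketch: the real machinery also deals with sys.modules, import locks and parent packages, and the helper name find_first_spec is invented here.

```python
import sys

def find_first_spec(name):
    """Return (finder, spec) from the first meta path finder that knows `name`."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # skip any finder that does not offer find_spec
        spec = find_spec(name, None)  # path=None means a top-level import
        if spec is not None:
            return finder, spec
    return None, None

finder, spec = find_first_spec("json")
print(finder)
print(spec.name)      # json

# No finder can handle this name, so nothing is returned.
missing_finder, missing_spec = find_first_spec("nosuchmodule")
print(missing_spec)   # None
```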

However, if we manually add a finder that can handle the module, the module can be found:

In [1]: import sys
   ...: 
   ...: from importlib.abc import MetaPathFinder
   ...: from importlib.machinery import ModuleSpec
   ...: 
   ...: class NoSuchModuleFinder(MetaPathFinder):
   ...:     def find_spec(self, fullname, path, target=None):
   ...:         return ModuleSpec('nosuchmodule', None)
   ...: 
   ...: # don't do this in your script
   ...: sys.meta_path = [NoSuchModuleFinder()]
   ...: 
   ...: import nosuchmodule
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-6-b7cbf7e60adc> in <module>
     11 sys.meta_path = [NoSuchModuleFinder()]
     12 
---> 13 import nosuchmodule

ImportError: missing loader

As you can see, ModuleNotFoundError is no longer raised once a finder's find_spec returns a spec for the module. However, to load a module successfully, you also need a loader.

The loader is an attribute of the ModuleSpec object that determines how a module is loaded and executed. To borrow the proverb "the master leads you to the door, but the practice is up to you": the ModuleSpec object is the master who leads you to the door, and the loader is the one who does the actual practice. In the loader, you have full control over how a module is loaded and executed. This control is not limited to loading and executing the module as it is; you can even modify the module:

In [1]: import sys
   ...: from types import ModuleType
   ...: from importlib.machinery import ModuleSpec
   ...: from importlib.abc import MetaPathFinder, Loader
   ...: 
   ...: class Module(ModuleType):
   ...:     def __init__(self, name):
   ...:         self.x = 1
   ...:         self.name = name
   ...: 
   ...: class ExampleLoader(Loader):
   ...:     def create_module(self, spec):
   ...:         return Module(spec.name)
   ...: 
   ...:     def exec_module(self, module):
   ...:         module.y = 2
   ...: 
   ...: class ExampleFinder(MetaPathFinder):
   ...:     def find_spec(self, fullname, path, target=None):
   ...:         return ModuleSpec('module', ExampleLoader())
   ...: 
   ...: sys.meta_path = [ExampleFinder()]

In [2]: import module

In [3]: module
Out[3]: <module 'module' (<__main__.ExampleLoader object at 0x7f7f0d07f890>)>

In [4]: module.x
Out[4]: 1

In [5]: module.y
Out[5]: 2

As you can see from the example above, a loader usually implements two important methods: create_module and exec_module. If exec_module is implemented, create_module is required as well. If the import mechanism was triggered by an import statement, the module object returned by create_module is bound to a name in the current local scope. Once a module has been loaded successfully, it is cached in sys.modules. If the module is imported again, the cached entry in sys.modules is used directly.
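The examples above replace sys.meta_path entirely, which is fine for an experiment but would break ordinary imports in a real program. A gentler variant, sketched below, handles exactly one module name and leaves the default finders in place. The names hello_virtual, GreetingLoader and GreetingFinder are hypothetical and exist only for this illustration.

```python
import sys
from importlib.abc import Loader, MetaPathFinder
from importlib.machinery import ModuleSpec

DEMO_NAME = "hello_virtual"  # hypothetical module name, no file backs it


class GreetingLoader(Loader):
    def create_module(self, spec):
        return None  # None asks Python to create a plain module object

    def exec_module(self, module):
        # "Executing" the module here just means populating its namespace.
        module.greeting = f"hello from {module.__name__}"


class GreetingFinder(MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname != DEMO_NAME:
            return None  # let the other finders handle everything else
        return ModuleSpec(fullname, GreetingLoader())


finder = GreetingFinder()
sys.meta_path.insert(0, finder)
try:
    import hello_virtual
finally:
    sys.meta_path.remove(finder)

print(hello_virtual.greeting)   # hello from hello_virtual

# The finder is gone, but a second import still works: it comes from sys.modules.
import hello_virtual as again
print(again is hello_virtual)   # True
```

Returning None for every other name is the important design choice: it keeps the built-in, frozen and path-based finders responsible for all regular imports.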

3. Import hooks

For simplicity, we did not mention the hooks of the import mechanism in the flowchart above. You can change the behavior of the import mechanism by adding hooks that affect how sys.meta_path or sys.path is processed. In the example above, we changed sys.meta_path directly. In fact, you can also do this with a hook registered in sys.path_hooks:

In [1]: import sys
   ...: from types import ModuleType
   ...: from importlib.machinery import ModuleSpec
   ...: from importlib.abc import MetaPathFinder, Loader
   ...: 
   ...: class Module(ModuleType):
   ...:     def __init__(self, name):
   ...:         self.x = 1
   ...:         self.name = name
   ...: 
   ...: class ExampleLoader(Loader):
   ...:     def create_module(self, spec):
   ...:         return Module(spec.name)
   ...: 
   ...:     def exec_module(self, module):
   ...:         module.y = 2
   ...: 
   ...: class ExampleFinder(MetaPathFinder):
   ...:     def find_spec(self, fullname, path, target=None):
   ...:         return ModuleSpec('module', ExampleLoader())
   ...: 
   ...: def example_hook(path):
   ...:     # some conditions here
   ...:     return ExampleFinder()
   ...: 
   ...: sys.path_hooks = [example_hook]
   ...: # force to use the hook
   ...: sys.path_importer_cache.clear()
   ...: 
   ...: import module
   ...: module
Out[1]: <module 'module' (<__main__.ExampleLoader object at 0x7fdb08f74b90>)>

4. Meta path finder

The job of a meta path finder is to determine whether a module can be found. These finders are stored in sys.meta_path for Python to traverse (finders can also be returned by import hooks, as in the example above). Each finder must implement the find_spec method. If a finder knows how to handle the module being imported, find_spec returns a ModuleSpec object (see the next section); otherwise it returns None.

As mentioned earlier, sys.meta_path contains three finders by default:

  • Built-in module finder
  • Frozen module finder
  • Path-based finder

Here we want to focus on the path-based finder. It searches a series of import paths, checking each one for a matching module to load. The default path entry finders implement everything needed to find modules on the file system, handling specific file types such as Python source files (.py files), Python compiled bytecode files (.pyc files), and shared library files (for example .so files). With the support of the zipimport module in the Python standard library, these file types (other than shared libraries) can also be loaded from zip files.
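To see the path-based finder at work, it can be queried directly. The sketch below uses json only as an example of a module that normally lives on the file system; the printed suffixes, origin and loader type depend on the interpreter and platform, so no output is shown for them.

```python
from importlib.machinery import PathFinder, all_suffixes

# File suffixes the default file-system finders recognise on this interpreter.
print(all_suffixes())

# Ask the path-based finder directly; without a path argument it searches sys.path.
spec = PathFinder.find_spec("json")
print(spec.name)                    # json
print(spec.origin)                  # location of the file that was found
print(type(spec.loader).__name__)   # loader class chosen for that file type
```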

Path entries are not limited to locations in the file system; they can also refer to URLs, database queries, or any other location that can be expressed as a string.

You can use the hooks described in the previous section to implement module lookup for such other kinds of path entries. For example, if you want to import modules by URL, you can write an import hook that parses the URL and returns a path entry finder.

Note that path entry finders are different from meta path finders. The latter live in sys.meta_path and are traversed by Python, while the former are used specifically by the path-based finder.
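A path hook for a non-file-system entry might look roughly like the sketch below. Instead of fetching anything from a URL, it serves source code from an in-memory dictionary, so it stays self-contained. The entry string "virtual://demo" and the names SOURCES, DictLoader, DictFinder, virtual_hook and virtual_mod are all assumptions made up for this illustration.

```python
import sys
from importlib.abc import Loader, PathEntryFinder
from importlib.machinery import ModuleSpec

VIRTUAL_ENTRY = "virtual://demo"                 # hypothetical sys.path entry
SOURCES = {"virtual_mod": "value = 21 * 2\n"}    # stands in for remote storage


class DictLoader(Loader):
    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        # Run the source text inside the module's own namespace.
        exec(self.source, module.__dict__)


class DictFinder(PathEntryFinder):
    def find_spec(self, fullname, target=None):
        source = SOURCES.get(fullname)
        if source is None:
            return None
        return ModuleSpec(fullname, DictLoader(source), origin=VIRTUAL_ENTRY)


def virtual_hook(path_entry):
    # A path hook signals "not mine" by raising ImportError.
    if path_entry != VIRTUAL_ENTRY:
        raise ImportError(f"not a virtual entry: {path_entry!r}")
    return DictFinder()


sys.path_hooks.append(virtual_hook)
sys.path.append(VIRTUAL_ENTRY)
sys.path_importer_cache.pop(VIRTUAL_ENTRY, None)
try:
    import virtual_mod
finally:
    # Undo the changes so the rest of the program is unaffected.
    sys.path.remove(VIRTUAL_ENTRY)
    sys.path_hooks.remove(virtual_hook)
    sys.path_importer_cache.pop(VIRTUAL_ENTRY, None)

print(virtual_mod.value)   # 42
```

The hook only claims the one path entry it understands, so the regular file-system entries on sys.path keep working as before.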

5. ModuleSpec object

Each meta path finder must implement the find_spec method, which returns a ModuleSpec object if the finder knows how to handle the module being imported. This object has two attributes worth mentioning: one is the module name and the other is the loader. If the loader of a ModuleSpec object is None, an exception like ImportError: missing loader is raised. The loader is used to create and execute the module (see the next section).

You can get the ModuleSpec object of a module through its __spec__ attribute:

In [1]: import sys

In [2]: sys.__spec__
Out[2]: ModuleSpec(name='sys', loader=<class '_frozen_importlib.BuiltinImporter'>)
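A spec can also be looked up by name with importlib.util.find_spec. The short sketch below prints a few of its attributes, using json as an arbitrary example; the loader and origin values vary between installations, so they are printed without an expected result.

```python
import importlib.util
import json

spec = importlib.util.find_spec("json")

print(spec.name)                          # json
print(spec.loader)                        # the loader that will execute the module
print(spec.origin)                        # where the module was found
print(spec.submodule_search_locations)    # set because json is a package

# For a module that is already imported, this is the module's own __spec__.
print(spec is json.__spec__)              # True

# A name that no finder can handle yields None instead of an exception.
print(importlib.util.find_spec("nosuchmodule"))   # None
```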

6. Loader

The loader creates a module with create_module and executes it with exec_module. Normally, if a module is a Python module (not a built-in module or a dynamic extension), the module's code needs to be executed in the module's __dict__ namespace. If the module's code cannot be executed, ImportError is raised; any other exception raised during execution is propagated as well.

In many cases, the finder and the loader are the same object. In that case, the loader attribute of the ModuleSpec object returned by the finder's find_spec method points to the finder itself.

We can create a module dynamically with create_module, and Python will automatically create a module if create_module returns None.
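The create-then-execute sequence can be driven by hand with importlib.util.module_from_spec, which is a reasonable way to watch a loader work without registering any finder. ConstantLoader, constants_demo and PI_ISH are hypothetical names used only in this sketch.

```python
import importlib.util
from importlib.abc import Loader
from importlib.machinery import ModuleSpec


class ConstantLoader(Loader):
    """Hypothetical loader that fills a module with a single constant."""

    def create_module(self, spec):
        return None  # None lets Python create a plain module object

    def exec_module(self, module):
        module.__dict__["PI_ISH"] = 3.14


spec = ModuleSpec("constants_demo", ConstantLoader())

# Creation: calls create_module and sets attributes such as __name__ and __spec__.
module = importlib.util.module_from_spec(spec)
print(type(module).__name__)        # module
print(hasattr(module, "PI_ISH"))    # False

# Execution: the loader populates the module's namespace.
spec.loader.exec_module(module)
print(module.PI_ISH)                # 3.14
```

Unlike a real import, this manual sequence does not add the module to sys.modules.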

7. Summary

Python's import mechanism is flexible and powerful. Most of the material above is based on the official documentation and on newer versions of Python (3.6+). For reasons of space, many details are not covered, such as the loading of submodules and the caching of module code. If you have any questions, please open an issue at github.com/pwwang/pyth… to ask and discuss.

